A case study : Influence of dimension reduction on regression trees-based algorithms - Predicting aeronautics loads of a derivative aircraft
Journal de la société française de statistique, Tome 159 (2018) no. 3, pp. 56-78.

In the aircraft industry, market needs evolve quickly in a highly competitive context. This requires adapting a given aircraft model in minimal time, for example to increase its range or its number of passengers (cf. the A330 NEO family). The computation of loads and stress to resize the airframe is on the critical path of this aircraft variant definition: it is a time-consuming and costly process, one of the reasons being the high dimensionality and the large amount of data. This is why Airbus has been investing for a couple of years in Big Data approaches (from statistical methods up to machine learning) to improve the speed, the data value extraction and the responsiveness of this process. This paper presents recent advances in this work, carried out in cooperation between Airbus, ENAC and Institut de Mathématiques de Toulouse in the framework of a proof-of-value sprint project. It compares the influence of three dimension reduction techniques (PCA, polynomial fitting, and their combination) on the extrapolation capabilities of regression-tree-based algorithms for loads prediction. It shows that AdaBoost with Random Forest offers promising results on average, in terms of accuracy and computational time, for estimating loads when PCA is applied only to the outputs.
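As an illustration of the setup the abstract describes (PCA applied only to the outputs, followed by AdaBoost with Random Forest as the base learner), here is a minimal sketch using scikit-learn. It is not the paper's implementation: the synthetic data, the function names `fit_pca_output_model` and `predict_curves`, the number of components and all hyperparameters are assumptions chosen for illustration. Because scikit-learn's `AdaBoostRegressor` handles a single target, the sketch fits one booster per retained principal component.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import AdaBoostRegressor, RandomForestRegressor


def fit_pca_output_model(X, Y, n_components=3, seed=0):
    """Compress the output curves with PCA, then fit one booster per PCA score."""
    pca = PCA(n_components=n_components).fit(Y)
    scores = pca.transform(Y)  # shape (n_samples, n_components)
    models = []
    for k in range(n_components):
        # AdaBoostRegressor is single-target, hence one model per component.
        base = RandomForestRegressor(n_estimators=10, max_depth=5, random_state=seed)
        booster = AdaBoostRegressor(base, n_estimators=5, random_state=seed)
        models.append(booster.fit(X, scores[:, k]))
    return pca, models


def predict_curves(pca, models, X):
    """Predict the PCA scores, then map them back to full-length curves."""
    scores = np.column_stack([m.predict(X) for m in models])
    return pca.inverse_transform(scores)


# Synthetic stand-in data (hypothetical): 200 cases, 4 inputs, 50-point curves.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 4))
stations = np.linspace(0.0, 1.0, 50)
Y = (X[:, [0]] * (1.0 - stations) ** 2
     + X[:, [1]] * np.sin(np.pi * stations)
     + 0.01 * rng.normal(size=(200, 50)))

# Train on the first 150 cases, predict the remaining 50 curves.
pca, models = fit_pca_output_model(X[:150], Y[:150])
Y_hat = predict_curves(pca, models, X[150:])
rmse = float(np.sqrt(np.mean((Y_hat - Y[150:]) ** 2)))
```

The held-out cases here are drawn from the same range as the training cases, so this sketch only checks the mechanics of the pipeline. It says nothing about extrapolation, which is the question the paper studies.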

Keywords: Regression trees, Aeronautics, Dimensional reduction, Extrapolation
@article{JSFS_2018__159_3_56_0,
     author = {Fournier, Edouard},
     title = {A case study : {Influence} of dimension reduction on regression trees-based algorithms - {Predicting} aeronautics loads of a derivative aircraft},
     journal = {Journal de la soci\'et\'e fran\c{c}aise de statistique},
     pages = {56--78},
     publisher = {Soci\'et\'e fran\c{c}aise de statistique},
     volume = {159},
     number = {3},
     year = {2018},
     mrnumber = {3901136},
     zbl = {1410.62214},
     language = {en},
     url = {http://www.numdam.org/item/JSFS_2018__159_3_56_0/}
}