An Empirical Exploration of Python Machine Learning API Usage
- ECU Author/Contributor (non-ECU co-authors, if there are any, appear on document)
- Aleksei Vilkomir (Creator)
- Institution
- East Carolina University (ECU)
- Web Site: http://www.ecu.edu/lib/
Abstract: Machine learning is becoming an increasingly important part of many domains, both inside and outside of computer science. As a result, more developers are learning to write machine learning applications in languages like Python, using application programming interfaces (APIs) such as pandas and scikit-learn. However, given their complexity, these APIs can be challenging to learn, especially for new programmers. To create better tools for assisting developers with machine learning APIs, we need to understand how these APIs are currently used. In this thesis, we present a study of machine learning API usage in Python code, based on a corpus of machine learning projects hosted on Kaggle, a machine learning education and competition community site. We analyzed the most frequently used machine-learning-related libraries and their sub-modules. Next, we studied which calls developers use to solve machine learning tasks. We also identified which libraries are used in combination and discovered a number of cases where libraries were imported but never used. We end by discussing potential next steps for further research and development based on our results.
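As an illustration of the kind of analysis the abstract describes, the following is a minimal sketch of how imported libraries, and libraries imported but never referenced, might be detected in a single Python source file. It is not the tooling used in the thesis, which this record does not describe. The function name `analyze_imports` and the `SAMPLE` script are hypothetical, and the approach (walking the syntax tree with the standard `ast` module and checking whether each imported name is ever referenced) is an assumption chosen for simplicity.

```python
import ast

def analyze_imports(source):
    """Return (imported, unused): sets of top-level library names.

    Hypothetical helper for illustration only. A library counts as
    "used" if any name bound by one of its imports is referenced
    anywhere in the file.
    """
    tree = ast.parse(source)
    bound = {}            # local name -> top-level library it came from
    star_imported = set() # libraries pulled in via "from x import *"

    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                top = alias.name.split(".")[0]
                # "import a.b" binds "a"; "import a.b as c" binds "c"
                bound[alias.asname or top] = top
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            top = node.module.split(".")[0]
            for alias in node.names:
                if alias.name == "*":
                    # Cannot tell which names are used; treat as used
                    star_imported.add(top)
                else:
                    bound[alias.asname or alias.name] = top

    # Every identifier that appears in the file (e.g. "pd" in pd.DataFrame)
    referenced = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}

    imported = set(bound.values()) | star_imported
    used = {lib for name, lib in bound.items() if name in referenced} | star_imported
    return imported, imported - used

# Hypothetical script: numpy is imported but never referenced
SAMPLE = (
    "import pandas as pd\n"
    "import numpy as np\n"
    "from sklearn.linear_model import LinearRegression\n"
    "df = pd.DataFrame({'x': [1, 2]})\n"
    "model = LinearRegression()\n"
)

imported, unused = analyze_imports(SAMPLE)
print(sorted(imported), sorted(unused))
```

Aggregating the `imported` sets across many files would give library frequencies and co-occurrence counts. The name-based check is deliberately coarse: it ignores scoping and dynamic imports, so results on real projects would need manual validation.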
Additional Information
- Publication
- Thesis
- Language: English
- Date: 2020
- Keywords
- Machine Learning API, Python Machine Learning, Machine Learning exploratory
Title | Location & Link | Type of Relationship |
An Empirical Exploration of Python Machine Learning API Usage | http://hdl.handle.net/10342/8796 | The described resource references, cites, or otherwise points to the related resource. |