
Recently, the KDnuggets website published the results of its 2018 survey of data science and machine learning tools. More than 2,300 participants voted on the “data mining / machine learning tools and programming languages used in project development over the past 12 months”.

The most popular tools for analytics, data science, and machine learning

Machine learning tools list! Python is the most powerful! R no longer!


Figure 1: The most popular analytics / data science / machine learning tools in 2018, compared with the results of previous years' surveys.

The 11 most popular tools, each with a share of more than 20%, are listed below.



Table 1: The most popular analytics / data science / machine learning software in 2018 (Top 10)

In the table above, “2018 % share” is the percentage of all voters who used the tool, and “% change” is the change in that share in 2018 compared with 2017.

On average, each respondent used 7 tools, slightly more than the 6.75 in 2017 (excluding respondents who voted for only one tool).

Compared with the 2017 software survey, Keras is a new entrant to the Top 10 this year.

Python replaces R as the most popular programming language

The survey showed that Python already had a share of more than 50% in the 2017 survey and rose to 66% this year, while R's share fell below 50% for the first time since the survey began (this year's is the nineteenth edition).

RapidMiner's popularity increased greatly

In the past several surveys, RapidMiner has been the highest-ranked data science platform, and its share increased from 33% in 2017 to 50% this year. However, this is because RapidMiner took measures to encourage its users to take part in the survey.

SQL's ranking remains stable

SQL, including Spark SQL and SQL to Hadoop tools, has accounted for about 40% of the votes in each of the past 3 surveys. So, if you are a data scientist, learn SQL: it will probably be useful for a long time.


The following table lists the tools whose usage grew by 20% or more and whose share was above 3% in 2018.



Table 2: Major analytics / data science / machine learning tools with the largest increase in usage

We note that of the 56 tools with a share of 2% or more in 2017, only 19 (about 1/3) saw increased use in 2018, while the other 37 declined. Together with recent acquisitions (Datawatch's acquisition of Angoss, Minitab's acquisition of Salford), this indicates that consolidation of data science platforms is under way.
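The “about 1/3” figure can be checked directly from the counts given above. A minimal sketch (the variable names are illustrative and not part of the survey):

```python
tools_2017 = 56  # tools with a share of 2% or more in 2017
increased = 19   # of those, tools whose use increased in 2018

decreased = tools_2017 - increased
fraction_increased = increased / tools_2017

print(decreased)                    # 37
print(f"{fraction_increased:.1%}")  # 33.9%
```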

The following table lists the tools that had a share of at least 3% in 2017 and whose usage fell by 25% or more this year.



Table 3: Major analytics / data science tools with the largest decline in usage

Deep learning tools

The survey results show that the share of voters using deep learning tools has held steady since last year. In this year's survey, 33% of voters used deep learning tools, compared with 32% in 2017 and 18% in 2016.

Google's TensorFlow is still the most popular deep learning platform, but Keras is also widely used and is not far behind TensorFlow.

PyTorch is in third place, with a share of 6.4%. However, most KDnuggets readers work in data science, so these figures may not fully reflect the true popularity of these deep learning tools in the community. PyTorch had a major upgrade this year and is being combined with Caffe2, so its share is expected to rise in the future.

Deep learning tool rankings:

TensorFlow, 29.9%

Keras, 22.2%

PyTorch, 6.4%

Theano, 4.9%

Other Deep Learning Tools, 4.9%

DeepLearning4J, 3.4%

Microsoft Cognitive Toolkit (Prev. CNTK), 3.0%

Apache MXNet, 1.5%

Caffe, 1.5%

Caffe2, 1.2%

TFLearn, 1.1%

Torch, 1.0%

Lasagne, 0.3%

Big data tools: the use of Hadoop has declined

In this year's survey, about 33% of the voters used big data tools, either Hadoop or Spark. This share was roughly the same as in 2017, but the use of Hadoop dropped significantly, by about 30%.

The detailed results are as follows:

[Figure: usage of big data tools (Hadoop and Spark)]


Programming languages

Python has replaced R as the most popular programming language among data science / machine learning developers, and is far ahead of the other programming languages. The rankings of SQL, Java and C/C++ remain the same.

This is the first decline in R's share since KDnuggets started the survey. The use of most other programming languages has also declined.

The main programming languages, ranked by popularity, are as follows.

Python, 65.6% (59% in 2017), up 11%

R, 48.5% (56.6% in 2017), down 14%

SQL, 39.6% (39.2% in 2017), up 1%

Java, 15.1% (15.5% in 2017), down 3%

Unix shell/awk/gawk, 9.2% (10.8% in 2017), down 15%

Other programming and data languages, 6.9% (7.6% in 2017), down 9%

C/C++, 6.8% (7.1% in 2017), down 3%

Scala, 5.9% (8.3% in 2017), down 29%

Perl, 1.0% (1.9% in 2017), down 46%

Julia, 0.7% (1.2% in 2017), down 45%

Lisp, 0.3% (0.4% in 2017), down 25%

Clojure, 0.2% (0.3% in 2017), down 38%

F#, 0.1% (0.5% in 2017), down 73%
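The change figures in the list above appear to be relative changes, that is, (2018 share - 2017 share) / 2017 share, rather than differences in percentage points. The following is a minimal sketch, assuming that definition, which recomputes the change for a few languages from the rounded shares quoted in this article. The function name `relative_change` and the dictionary layout are illustrative and not part of the survey.

```python
# Shares in percent, as quoted in this article: (2018, 2017).
shares = {
    "Python": (65.6, 59.0),
    "R": (48.5, 56.6),
    "SQL": (39.6, 39.2),
    "Java": (15.1, 15.5),
    "Scala": (5.9, 8.3),
}


def relative_change(new: float, old: float) -> float:
    """Return the relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Round to whole percent, as the article does.
changes = {
    lang: round(relative_change(new, old))
    for lang, (new, old) in shares.items()
}

for lang, change in changes.items():
    direction = "up" if change > 0 else "down"
    print(f"{lang}: {direction} {abs(change)}%")
```

For these five languages the recomputed values match the figures quoted above. For languages with very small shares (for example Perl or Julia), recomputing from the rounded shares gives results a few points off the quoted figures, presumably because the survey calculated them from unrounded data.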

The original text was published on: 2018-06-1

The author of this article: Xiao Qin

You are welcome to follow my blog or public account: Python learning and communication.

You are welcome to join my 1000-member Q&A exchange group: 125240963

