As an aspiring data scientist, which programming language should I learn?

R and Python are indeed top choices for data science. A word of caution, though: choose one or the other based solely on your situation, not on the opinions of others. People tend to have strong opinions about programming languages, and you may be getting well-intentioned advice that is not the best for your particular situation. Here is a quick overview of why you would choose one or the other.

Why R?

R has been used for statistical computing for over two decades now. You can get started writing useful code in no time. It has been used extensively by data scientists and has an insane number of packages available for data science tasks. I have almost always been able to find a package in R to get the task done very quickly. I have decent Python skills and have written production code in Python. Even so, I find R slightly better for quickly testing out ideas, trying out different ways to visualize data, and rapid prototyping.

Why Python?

Python has advantages over R in certain situations. It is a general-purpose programming language, and it has libraries such as pandas, NumPy, SciPy and scikit-learn, to name a few, which come in handy for data science work. If you get to the point where you have to showcase your data science work, Python would be a clear winner. Python combined with Django, a web application framework, lets you build a web service or site with both your data science and your web programming done in the same language.

You may hear speed and efficiency arguments from both camps - ignore them for now. If you get to a point where you are doing something substantial enough that the speed of your code matters, you will probably figure things out on your own. So don't worry about it at this point.

Conclusion

Considering that you are a beginner in both data science and programming, and that you have a background in Economics and Statistics, I would lean towards R.
Besides being very powerful, Python is without a doubt one of the friendliest programming languages for beginners - but it is still a programming language. Your learning curve may be a bit steeper with Python than with R. You should definitely learn Python once you are comfortable with R and have grasped the general concepts of data science, which will take some time. You can read "What are the key skills of a data scientist?" to get an idea of the skill set you will need to become a data scientist. Start with R, transition to Python gradually, and then use both as needed. Both are great for data science, but each is better than the other in certain situations.
