Many people today promote the good side of working as a data scientist or data analyst, but there are plenty of other sides to the profession that few will tell you about. It is important to know all of them before you even start a career in the field, because it can help you make better decisions.
So let's start with the fact that the job is not really new. The profession has existed since around 2000; it is essentially statistics that we finally rebranded as something more appealing. People treat it as a brand new thing, but years ago people were already doing this work with BI tools or with statistics. It is an old job that was rebranded to sound sexier in 2021, so in the end it is basically the same thing. As a data analyst, for example, you are basically doing statistics. The software you use implements most of the hard parts, but you still need to master the basics to understand what you are doing and to choose the best method for the problem you are trying to solve.
Secondly, most of the work in these jobs does not start from data that is already clean and ready to use. Solving Kaggle problems has almost nothing to do with your future day-to-day job. In data science, the hardest part is usually the data collection and cleaning that you must do before starting the analysis. Any ML model or business analysis task needs data, and in most companies no one will hand this data to you. Let me make it clearer: this data can be real garbage that you have to collect, clean, and aggregate yourself.
This data can live on paper, in Excel sheets, in a CMS, in online surveys, in some really old CRM or ERP, or in SQL databases. You have to be able to retrieve it, understand how the different sources relate to each other, and clean and format it before starting your analytics or even thinking about a model. Unfortunately, this process takes a huge amount of time. Data collection and cleaning are often said to make up almost 80% of a data science job. Honestly, if data science were just writing some notebooks and tweaking some parameters to get better accuracy (a.k.a. the Kaggle way), it would have been automated by now.
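To give an idea of what this collect, clean, and aggregate work looks like, here is a minimal sketch using only the Python standard library. Everything in it is made up for illustration: the two sources (a CRM export and a spreadsheet of orders), the column names, and the cleaning rules are assumptions, and a real project would have many more sources and many more edge cases.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export from an old CRM: stray spaces, inconsistent casing,
# a duplicated row, and a customer with no name.
CRM_CSV = """customer_id,name,country
 101 ,Alice Martin,france
102,bob stone ,Germany
102,bob stone ,Germany
103,,FRANCE
"""

# Hypothetical spreadsheet of orders: semicolon separator, comma decimals,
# a missing amount, and an order for a customer the CRM does not know.
ORDERS_CSV = """customer_id;amount
101;19,90
101;5,10
102;42,00
103;
104;7,50
"""


def load_customers(text):
    """Return {customer_id: {"name", "country"}} with cleaned values."""
    customers = {}
    for row in csv.DictReader(io.StringIO(text)):
        cid = row["customer_id"].strip()
        name = row["name"].strip().title()
        if not name:
            continue  # drop records that cannot be identified
        # Re-assigning the same key removes exact duplicates.
        customers[cid] = {"name": name, "country": row["country"].strip().title()}
    return customers


def load_orders(text):
    """Return a list of (customer_id, amount) with amounts parsed as floats."""
    orders = []
    for row in csv.DictReader(io.StringIO(text), delimiter=";"):
        raw = row["amount"].strip().replace(",", ".")
        if not raw:
            continue  # skip orders with no amount
        orders.append((row["customer_id"].strip(), float(raw)))
    return orders


def revenue_by_country(customers, orders):
    """Join both sources on customer_id and sum the amounts per country."""
    totals = defaultdict(float)
    for cid, amount in orders:
        customer = customers.get(cid)
        if customer is None:
            continue  # order refers to a customer we could not clean or find
        totals[customer["country"]] += amount
    return {country: round(total, 2) for country, total in totals.items()}


customers = load_customers(CRM_CSV)
orders = load_orders(ORDERS_CSV)
print(revenue_by_country(customers, orders))
```

With this toy data the script prints `{'France': 25.0, 'Germany': 42.0}`. Notice that almost all of the code deals with messy input, and only the last function does anything that looks like analysis.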
The other side of this incredible job is that you will not be the nerd sitting behind a computer all day. Instead, you will have to talk a lot, present your results to management, and speak with different departments, either to understand the problem you are solving, to get data, or just to communicate the results of your analyses.
It is a job that requires you to spend a lot of time in meetings and presentations. What you are actually doing is analysing data to give direction to management, marketing, or whoever needs it, so that they can make better decisions. For that, you will have to present what you are doing, explain the graphs, and show how what you did could be implemented and why it is even useful.
That is the reason why I think data scientists do portfolios the wrong way. Instead of focusing on code or GitHub or whatever, it is more important to show and explain the findings, the results, and the process of your work. For example, if you are building a machine learning model, instead of showing that you know how to plot a line, show that the model can predict something that could improve the return on a business investment. That is what businesses are looking for: your ability to help them make better decisions. I think I will do a separate story about data scientist portfolios.
So presentations and meetings will be a big part of your work.
Another problem with the job is that people who enter the field with a boot camp certificate, or even a bachelor's or master's degree, do not realize that it can take a long time to find a stable job. What we keep hearing lately is that it is a new job and that it pays well, but you may need a good portfolio, experience, and even a PhD with some publications in order to get hired. People think it is like web development, where you could supposedly learn in 3 weeks, write a CV, and get hired within a month. It may be a bit more challenging.
Anyway, let's assume your background is not an issue. As an ML engineer, for example, you will face a sad truth: almost no one really knows what they are doing in this domain. A lot of processes and techniques are new and keep changing all the time, and you have to stay up to date with all of that, maybe by reading research papers and recent publications. For example, I was recently trying to build an inference model on a graph database. After some research, I scheduled meetings with some professors, and they told me that the domain is pretty new and that they could not give me solid advice; I could just try many things and see. I was a bit surprised, but that is how things go there. You may find out that graph neural networks are not mature yet, and you will have to read research papers to find the latest way to solve a problem.
I think I have listed too many difficulties with the domain. There are still some good parts though. For example, as data science is still evolving, there are a lot of opportunities to innovate, and new jobs may be created in the coming years. So it is an exciting domain to be part of. I can also highlight the fact that being close to management gives you a lot of influence in the company. You can have a noticeable impact on the business because you are supporting the decisions being made and helping to give direction to the company. Also, in some areas (not everywhere, unfortunately), the salaries are really appealing. That may be a good motivation to start in the domain.
That's basically what I wanted to present today. I just wanted to share some things you may want to know before starting in the field.
By the way, if you are interested and want to learn data science, don't hesitate to visit https://ulife.school/class/path/data-science for a free course.