With the rapid growth of digital data generated by diverse sources, including devices, the possibilities for innovative solutions are endless. As a result, data science has become an important investment area with a variety of use cases. Organizations across industries are setting up their own data science practices to leverage this treasure trove of data and solve complex problems.

As a culmination of their years of experience in this discipline, Mr. Vineet Raina and Mr. Srinath Krishnamurthy have authored a book covering various data science aspects in depth. GS Lab | GAVS conducted an interactive session to discuss their book “Building an Effective Data Science Practice”. The link to the entire session is available at the end of the blog.

Mr. Vineet Raina is the Chief Data Scientist at GS Lab | GAVS, where he led the effort to set up a Data Science practice. With over 17 years of experience, he has been associated with multiple Data Science projects and holds two US patents. Mr. Srinath Krishnamurthy, the Principal Architect at GS Lab | GAVS, is TOGAF9-certified and has 17 years of experience in data mining, predictive modeling, and analytics.

Data Science Practice and Software Development Practice – The Difference

The main focus of a data science team is to adopt scientific practices: applying the scientific method to problems, forming hypotheses, and designing experiments to validate them. The thought process behind this kind of scientific approach differs significantly from what happens in software development, where the focus is on creating a product or service. Once the functional and non-functional requirements are finalized, developers concentrate on writing code to fulfill those requirements.
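To make the idea of validating a hypothesis with an experiment concrete, here is a minimal sketch of a permutation test. It is only an illustration and not a method taken from the session or the book: the function name, the sample numbers, and the choice of a permutation test are all assumptions. The hypothesis being checked is that a "treatment" group has a higher average than a "control" group.

```python
import random
from statistics import mean


def permutation_test(control, treatment, n_permutations=2000, seed=0):
    """Estimate how often a difference in means at least as large as the
    observed one appears when group labels are assigned at random.

    Hypothetical helper written for illustration only.
    """
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    observed = mean(treatment) - mean(control)
    pooled = list(control) + list(treatment)
    n_control = len(control)

    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign the group labels
        diff = mean(pooled[n_control:]) - mean(pooled[:n_control])
        if diff >= observed:
            at_least_as_large += 1

    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (at_least_as_large + 1) / (n_permutations + 1)
    return observed, p_value


# Made-up measurements, purely for demonstration.
control = [10, 11, 9, 10, 12, 10, 11, 9]
treatment = [14, 15, 13, 16, 14, 15, 13, 14]

observed, p_value = permutation_test(control, treatment)
print(f"observed difference: {observed:.2f}, p-value: {p_value:.4f}")
```

A small p-value suggests the observed difference is unlikely to be a result of chance alone, which supports the hypothesis. A large one sends the team back to refine the hypothesis or the experiment.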

Another key difference between data science and software development practices is that data science depends entirely on access to good-quality data. It is interdisciplinary and depends heavily on other teams. Data science is also iterative, because it takes a scientific approach to solving a problem: the models, and correspondingly the metrics, are improved continuously.

Problems that Data Science Can Solve

Data science has an endless variety of applications. For example, industries that store huge volumes of liquids in large tanks can use data science to predict how the liquids will behave, or to determine when activities such as heating or cooling need to be performed on them. Data science models can also be used for operational optimization, such as scheduling certain activities appropriately to reduce overall consumption. Oil and gas companies can, for instance, use it to predict sales at gas stations and optimize inventory dispatch based on each station's needs.

Data science can also be used to add useful features to a product. For example, video conferencing tools can reduce background noise by implementing models that suppress noise in real time. In healthcare, data science models can be used to detect underlying health conditions such as asthma or Alzheimer’s.

Types of Data Scientists

Broadly, data scientists follow one of two different types of thought processes. The monastic approach is where the data scientist tries to understand the truth underlying the data, such as the relationships among variables or the underlying processes that caused these observations. After understanding the entire truth, they create models based on mathematical equations to predict future observations.

In the other approach, data scientists focus on predictive accuracy. They are not particularly interested in long-term truths or what caused the observations. As new data comes in, these scientists continuously train their models to meet expectations.
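The accuracy-focused approach can be sketched as a loop that scores the current model on each new batch of data and then retrains it. The code below is a simplified illustration under stated assumptions: the straight-line model, the `window` parameter, the function names, and the synthetic batches are all hypothetical and are not drawn from the book or the session.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def mean_abs_error(model, xs, ys):
    """Average absolute gap between predictions and observations."""
    slope, intercept = model
    return mean(abs(slope * x + intercept - y) for x, y in zip(xs, ys))


def retrain_on_arrival(batches, window=1):
    """Score the current model on each new batch, then refit it on the
    most recent `window` batches (hypothetical retraining policy)."""
    history, errors, model = [], [], None
    for xs, ys in batches:
        if model is not None:
            errors.append(mean_abs_error(model, xs, ys))
        history.append((xs, ys))
        recent = history[-window:]
        all_x = [x for batch_x, _ in recent for x in batch_x]
        all_y = [y for _, batch_y in recent for y in batch_y]
        model = fit_line(all_x, all_y)
    return model, errors


# Synthetic data: the relationship drifts from y = 2x to y = 3x.
batches = [
    ([1, 2, 3], [2, 4, 6]),
    ([4, 5, 6], [8, 10, 12]),
    ([7, 8, 9], [21, 24, 27]),
    ([10, 11, 12], [30, 33, 36]),
]

model, errors = retrain_on_arrival(batches)
print("error on each new batch before retraining:", errors)
print("final (slope, intercept):", model)
```

In this toy run the error spikes on the batch where the relationship changes and drops again once the model has been retrained on recent data. The model never explains why the change happened; it only keeps predictions in line with the latest observations.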

Which of these two approaches is chosen depends entirely on the project. Factors such as the domain and the business requirements play a key role in the decision.

Skills to Become a Successful Data Scientist

There are four major skills one needs to master to become a data scientist. First, one has to be strong in mathematics and statistics. These skills are applied on a day-to-day basis, even when using software. The second skill is software programming, as data scientists must write the code that goes into the software. The third is the ability to understand a domain in depth, which is crucial for building a model that actually works for the problem at hand. Finally, it is important to have a scientific temperament. Data scientists need to conduct experiments every day, change tracks as required, and keep going despite failures.

This blog offers only a high-level gist of the interactive session. You can watch the full video here.

The book offers valuable insights into the authors’ learnings over the years, current trends, and real-life data science-based products and solutions. It is worth your time whether you are just starting your data science journey or are already well ahead on that road. You can purchase the book here.

The GS Lab | GAVS Data Science practice helps you maximize value from your data and build data-driven systems from the ground up. We help hi-tech products develop their data strategy and manage their complete data pipeline right from data acquisition to complex machine learning models. We believe the real power of data is realized when we move from data to insights to decisions and onto actions. For more on how we can meet your data science needs, please visit