• 16. Teaching Data Through Stories
    Jan 12 2022

    The phrase “using data to tell stories” is so commonly used nowadays that it runs the risk of becoming a cliche, if it hasn’t become one already. This episode’s guest flips this logic around - instead of using data to tell stories, he uses stories to teach data science!

    Arvind Venkatadri is a faculty member at Srishti Manipal School of Art, Design and Technology. His research/teaching interests include TRIZ, Computation in R, Design using Open Source Electronics Hardware, and Complexity Science. He is part of the School of Foundation Studies at SMI.

    This is a very wide ranging conversation. We talk about, among other things, The Three Musketeers, Lawrence of Arabia and Legally Blonde. We talk about how Arvind leverages all of these to teach his students data science and logic and game theory.

    At a time when the field of data science is rife with “pile stirring”, where a large section of practitioners treat it as an extension of software engineering, Arvind’s approach, centred on stories and the human experience, is really refreshing. His approach also gives a pointer on how to widen the base in terms of attracting people into data science. 

    I must apologise for one thing - this conversation was recorded during Deepavali in November 2021, so you can occasionally hear the sound of firecrackers in the background. I really hope you can get past that and listen to Arvind’s stories.


    Show Notes

    00:03:00: Arvind’s journey into teaching Data Science in an art school

    00:05:45: Teaching data science to art students

    00:15:45: Teaching statistics through art and stories. Wassily Kandinsky

    00:23:00: Teaching coding through art

    00:31:00: Shapes and colours and emotions

    00:44:00: Lawrence of Arabia (can’t say more here in the description!)

    00:50:00: Data science and the human experience


    Links:

    Arvind’s homepage

    Arvind on Twitter

    Arvind’s course on R for artists and designers

    An intro to Wassily Kandinsky's work


    Data Chatter is a podcast on all things data. It is a series of conversations with experts and industry leaders in data, and each week we aim to unpack a different compartment of the "data suitcase".

    The podcast is hosted by Karthik Shashidhar. He is a blogger, newspaper columnist, book author and a former data and strategy consultant. Karthik currently heads Analytics and Business Intelligence for Delhivery, one of India’s largest logistics companies.

    You can follow him on twitter at @karthiks, and read his blog at noenthuda.com

    55 mins
  • 15. On Data And Journalism
    Dec 22 2021

    There is a conception, or misconception, that journalists are not good at maths. It is rather common to see newspaper headlines and graphics that make basic mathematical and logical errors.

    On the other hand, in the last decade or so, we have seen a massive rise in “data journalism”. With more and more data being available, journalists are able to write stories exclusively based on data.

    How do these two square off?

    To answer this, we have Sukumar Ranganathan, editor in chief of the Hindustan Times. He was previously editor of Mint, of which he was one of the founding editors. It was while he was at Mint that he gave a big push to the then nascent field of “data journalism”, inviting writers such as HowIndiaLives, Rukmini S and myself to write data-backed pieces for Mint. He has previously worked in editorial leadership roles at The Hindu Businessline and Business Today.

    Sukumar has degrees in chemical engineering, maths, and business administration, and is interested in mathematics, science and technology, the history of business, new media, and data-based political journalism. He reads and collects comic books and is an amateur birder. He tweets under the ID @HT_ed.

    Show Notes:

    00:03:15: Are journalists really bad at maths?
    00:16:30: Impact of bad data on public policy, and information theory
    00:21:00: How data in journalism has changed in the last 20-25 years
    00:23:00: The data journalism story
    00:31:15: Judging a data story
    00:45:30: Advice to budding data journalists

    50 mins
  • 14. Programming Data Science: R vs Python
    Oct 19 2021

    There are two dominant programming languages used for data science nowadays - R and Python, each having its own set of loyal users. Both have their own strengths and weaknesses. In this episode, we look at what each language is good and bad at, what kind of people are more likely to use each, and how being able to program in both and switch seamlessly can indeed be a superpower.

    Today’s guest is Abdul Majed Raja RS, a Data Scientist at Atlassian. Abdul Majed likes to call himself an Analytics Consultant with over a decade of experience helping organisations solve their business problems. He's also a Content Creator trying to help newcomers navigate the Data Science space easily and learn continuously. You can find him on Twitter and on YouTube at 1littlecoder.


    Show Notes: 

    00:03:00: How Abdul got into analytics
    00:05:30: MS Excel in data science
    00:07:45: When to use R and when to use Python
    00:17:00: What data scientists can learn from software engineers
    00:24:30: Graphics and visualisations in R and Python
    00:26:45: Machine learning in R and Python
    00:29:15: Why the Indian market in Data Science leans towards Python
    00:34:45: Working with databases
    00:37:30: Building dashboards in R and Python
    00:47:00: Working with R *and* Python at the same time
    00:51:30: What about Excel and Julia?

    Links

    I don't like Notebooks - Joel Grus

    Interface between R and Python - reticulate.

    Julia Silge Youtube Channel for latest Tidymodels tutorials

    Advantages of Using R Notebooks For Data Analysis Instead of Jupyter Notebooks - Max Woolf


    1 hr and 1 min
  • 13. Is this real, Data Science, or is it a fantasy?
    Oct 12 2021

    Over the last decade, we have seen tremendous advances in big data, data science, artificial intelligence and machine learning. Every company wants to be a tech-first company now, and wants to “do data science”. Companies can probably double their valuation by just adding a “.ai” to their names. Companies that actually use artificial intelligence and machine learning may have an even higher premium on their valuations.

    However, is Data Science worth the hype? Is AI going to take over the world? And why is data science being eaten by computer science? What happened to classical analytics, operations research and statistics?

    This week’s guest is someone who did data science even before the phrase had been invented.

    Amaresh Tripathy is SVP and Analytics Business Leader at Genpact. Till recently he was a Partner with PWC, leading the firm’s Data & Analytics Consulting, and helped build a $500mm business. Previously, Amaresh founded and co-led the Information and Analytics Practice for Diamond Management & Technology Consultants, and also serves as Adjunct Professor of Data Science and Business Analytics at the University of North Carolina, Charlotte.

    Amaresh has helped Fortune 500 companies in multiple industries (healthcare, retail & consumer, communications) define and implement their analytics and AI strategies and institutionalize data-enabled decision making. He has led organizations to help embed analytics in their front, middle and back office functions and manage the change process.


    Show Notes:

    00:03:00: Definitions - data science, artificial intelligence, machine learning, etc.
    00:04:15: The rise of computer science and machine learning
    00:10:15: The problem with Kaggle, and the “race for accuracy”
    00:11:30: How to scale analytics without doing bad data analysis
    00:18:00: How selling data science has changed over the last decade
    00:23:00: The interaction between business and Data Science
    00:26:30: “Creating bilinguals at scale”
    00:30:30: Machine learning trying to eat data science
    00:39:00: Comparing data science practices across countries

    Links:

    Thomas Davenport and DJ Patil on Data Science as the “sexiest job of the 21st century” (2012 article)

    Hal Varian on statistics as a “sexy job”


    46 mins
  • 12. Carts and graphs: Storytelling through maps
    Oct 5 2021

    In this edition of Data Chatter, we will talk about maps. Maps are excellent devices for telling stories. Think of the maps you see around election times that show which parties won seats where. In fact, the first ever scatter plot, Dr. John Snow’s figure of cholera cases in London, was essentially a map. Or think of the famous map of Napoleon’s invasion of Russia.

    And telling stories through maps is an exercise in data science. Data overlaid on maps can help tell really powerful stories. And as we learn in today’s conversation, the process of mapping is no different from the process of data science.

    Our guest is Raj Bhagat Palanichamy, or, as he calls himself, “mapper for life”. Raj works for the World Resources Institute India, where he leads projects on urban development, water resources and transport.

    In this conversation, Raj talks about his journey into mapping, how he makes his maps, and how WWE influences the way he tells his stories.


    Highlights:

    00:03:00: Raj's journey into the world of maps and mapmaking
    00:06:15: The process of creating maps to tell stories
    00:12:30: Choosing colours
    00:17:00: The importance of annotations in storytelling
    00:23:15: Data, digitisation and tools
    00:35:45: Taking inspiration from WWE to construct "stories" with maps
    00:42:13: Mapping cities versus mapping landscapes
    00:44:30: Where is mapping underrated and overrated?

    Raj's 30 day map challenge in 2020

    51 mins
  • 11. Unknown Unknowns: Risk and Uncertainty
    Sep 14 2021

    The fundamental principle underlying all analytics and data science is Probability. And probability was first invented, or should I say discovered, to assess risk. So what is risk? Can we quantify and measure it? How do we handle risk in life? Is risk always bad?

    Today’s guest on Data Chatter is Bala Vamsi Tatavarthy, who is co-founder and investment advisor at Aravali Asset Management, a global arbitrage fund.

    Vamsi was my classmate at IIT Madras, where he studied computer science but spent most of his time gaming. He then went to IIM Ahmedabad, where he continued to game heavily and graduated with a gold medal. He now runs a hedge fund, and spends a lot of time gaming. 

    Moreover, he was one of the last traders to trade on behalf of Lehman Brothers, on 15th September 2008. 

    Risk, as you can imagine, is a vast subject, and so this is a long podcast. We talk about measuring risk, problems with too much measurement of risk, how risk can be managed, and all that. We also talk about movies, games, the differences between poker and bridge, and physics envy.

    Show Notes

    00:03:45: Defining Risk, and Lehman Brothers’ collapse

    00:09:00: Can risk be created or destroyed? Is it conserved?

    00:15:00: Risk, probability distributions and long tails

    00:20:45: Uncertainty, volatility and risk

    00:28:30: Hedging

    00:35:00: Utility functions

    00:42:30: Games and risk

    00:54:00: Bridge and poker, and finite and infinite games

    01:04:15: Ergodicity

    01:07:30: VaR, Risk-metrics and Goodhart’s Law

    01:14:30: Correlation

    Links:

    Finite and Infinite Games

    “Risk once created cannot be destroyed”

    The Wired article about Gaussian Copula, used to estimate correlations

    Too Big To Fail, by Andrew Ross Sorkin

    Ergodicity Economics 

    1 hr and 25 mins
  • 10. From Science To Data Science
    Aug 30 2021

    When I was graduating college in the mid 2000s, the word in job descriptions that most commonly appeared alongside “data” was “analytics”. However, around 2010, the phrase “data science” (HBR link) got coined, and took over the world in the next five years. Nowadays it seems everyone wants to be a “data scientist”.

    However, where is the science in data science? And why are so many people with PhDs in pure science moving to data science?

    To understand this better, I bring back one of the old guests of Data Chatter. Dhanya P is an aerospace engineer turned neuroscientist turned data scientist. She is co-founder of Messy Fractals and Kabaddi Adda, and a Senior Scientist at Sapien Labs. Dhanya talks about her journey from neuroscience to data science, why a PhD is good training for data science, and what the “science” in data science is all about.

    You can follow Dhanya on Twitter at d2a2d

    Show Notes:

    00:02:30: Dhanya’s journey from Aerospace Engineering to Neuroscience to Data Science
    00:07:00: Why data science and not academia after PhD
    00:11:45: Defining data science, and how she approaches a problem
    00:16:00: How a PhD prepares you for a career in data science
    00:20:00: Challenges in industry due to academic background
    00:23:00: Learning to code
    00:26:50: The challenges of working with someone else’s data, and proxies
    00:37:30: Communicating results
    00:42:45: Are ex-academics better at certain kinds of Data Science roles?
    00:46:00: “Entropy” in the brain
    00:51:30: Revisiting the biomechanics of Kabaddi players, and communicating data to sportspersons

    1 hr
  • 9. Making Data Science Work
    Aug 16 2021

    Everyone wants to do “data science”. Companies want to introduce “machine learning” in their products. Most fund raises by startups nowadays are accompanied by a statement of intent to invest in data, and data science.

    Back in 2006, mathematician Clive Humby, who was working for Tesco, made the statement that “data is the new oil” (to give context, we were in the middle of a massive bull run in oil prices then). And so companies are investing in data.

    However, just investing in data capture and hiring data scientists is not enough for a company to get value. It is important to structure the relationship between data and business, and how the data team is managed, in the right way for the data team to be effective.

    Today’s guest is Anuj Krishna. Over the last 14 years, Anuj has worked with multiple enterprises on both the translation side and the execution side of analytics. He has helped create standard processes for analytical problem solving that are in use in multiple enterprises.

    Anuj was an early employee of MuSigma, and then went on to co-found TheMathCompany. In his current role, Anuj is Head of Assets at TheMathCompany, and is also responsible for operations related to TheMathCompany.

    Show Notes:

    00:03:00: How business and data science currently interact

    00:06:30: Translating from analytics to business

    00:13:00: Structuring a data science team

    00:22:00: Data science versus business intelligence

    00:29:00: How can a business person get best value out of a data team?

    00:32:00: Why data science projects fail

    00:38:30: Evolution of the data science industry over the last decade

    Links:

    Anuj Krishna

    TheMathCompany (LinkedIn)

    49 mins