Communication Skills | Towards Data Science
The world’s leading publication for data science, AI, and ML professionals.

Learnings from a Machine Learning Engineer — Part 6: The Human Side
Published April 11, 2025
Practical advice for the humans involved with machine learning

The post Learnings from a Machine Learning Engineer — Part 6: The Human Side appeared first on Towards Data Science.

In my previous articles, I have spent a lot of time talking about the technical aspects of an image classification problem, from data collection, model evaluation, and performance optimization, to a detailed look at model training.

These elements require a certain degree of in-depth expertise, and they (usually) have well-defined metrics and established processes that are within our control.

Now it’s time to consider…

The human aspects of machine learning

Yes, this may seem like an oxymoron! But it is the interaction with people — the ones you work with and the ones who use your application — that helps bring the technology to life and provides a sense of fulfillment to your work.

These human interactions include:

  • Communicating technical concepts to a non-technical audience.
  • Understanding how your end-users engage with your application.
  • Providing clear expectations on what the model can and cannot do.

I also want to touch on the impact to people’s jobs, both positive and negative, as AI becomes a part of our everyday lives.

Overview

As in my previous articles, I will gear this discussion around an image classification application. With that in mind, these are the groups of people involved with your project:

  • AI/ML Engineer (that’s you) — bringing life to the Machine Learning application.
  • MLOps team — your peers who will deploy, monitor, and enhance your application.
  • Subject matter experts — the ones who will provide the care and feeding of labeled data.
  • Stakeholders — the ones who are looking for a solution to a real world problem.
  • End-users — the ones who will be using your application. These could be internal and external customers.
  • Marketing — the ones who will be promoting usage of your application.
  • Leadership — the ones who are paying the bill and need to see business value.

Let’s dive right in…

AI/ML Engineer

You may be a part of a team or a lone wolf. You may be an individual contributor or a team leader.


Whatever your role, it is important to see the whole picture — not only the coding, the data science, and the technology behind AI/ML — but the value that it brings to your organization.

Understand the business needs

Your company faces many challenges to reduce expenses, improve customer satisfaction, and remain profitable. Position yourself as someone who can create an application that helps achieve their goals.

  • What are the pain points in a business process?
  • What is the value of using your application (time savings, cost savings)?
  • What are the risks of a poor implementation?
  • What is the roadmap for future enhancements and use-cases?
  • What other areas of the business could benefit from the application, and what design choices will help future-proof your work?

Communication

Deep technical discussions with your peers are probably your comfort zone. However, to be a more successful AI/ML Engineer, you should be able to clearly explain the work you are doing to different audiences.

With practice, you can explain these topics in ways that your non-technical business users can follow along with, and understand how your technology will benefit them.

To help you get comfortable with this, try creating a PowerPoint with 2–3 slides that you can cover in 5–10 minutes. For example, explain how a neural network can take an image of a cat or a dog and determine which one it is.

Practice giving this presentation in your mind, to a friend — even your pet dog or cat! This will get you more comfortable with the transitions, tighten up the content, and ensure you cover all the important points as clearly as possible.

  • Be sure to include visuals — pure text is boring, graphics are memorable.
  • Keep an eye on time — respect your audience’s busy schedule and stick to the 5–10 minutes you are given.
  • Put yourself in their shoes — your audience is interested in how the technology will benefit them, not in how smart you are.

Creating a technical presentation is a lot like the Feynman Technique — explaining a complex subject to your audience by breaking it into easily digestible pieces, with the added benefit of helping you understand it more completely yourself.

MLOps team

These are the people that deploy your application, manage data pipelines, and monitor infrastructure that keeps things running.

Without them, your model lives in a Jupyter notebook and helps nobody!


These are your technical peers, so you should be able to connect with their skillset more naturally. You speak in jargon that sounds like a foreign language to most people. Even so, it is extremely helpful for you to create documentation to set expectations around:

  • Process and data flows.
  • Data quality standards.
  • Service level agreements for model performance and availability.
  • Infrastructure requirements for compute and storage.
  • Roles and responsibilities.

It is easy to have a more informal relationship with your MLOps team, but remember that everyone is trying to juggle many projects at the same time.

Email and chat messages are fine for quick-hit issues. But for larger tasks, you will want a system to track things like user stories, enhancement requests, and break-fix issues. This way you can prioritize the work and ensure you don’t forget something. Plus, you can show progress to your supervisor.

Some great tools exist, such as:

  • Jira, GitHub, Azure DevOps Boards, Asana, Monday, etc.

We are all professionals, so having a more formal system to avoid miscommunication and mistrust is good business.

Subject matter experts

These are the team members that have the most experience working with the data that you will be using in your AI/ML project.


SMEs are very skilled at dealing with messy data — they are human, after all! They can handle one-off situations by considering knowledge outside of their area of expertise. For example, a doctor may recognize metal inserts in a patient’s X-ray that indicate prior surgery. They may also notice a faulty X-ray image due to equipment malfunction or technician error.

However, your machine learning model only knows what it knows, which comes from the data it was trained on. So, those one-off cases may not be appropriate for the model you are training. Your SMEs need to understand that clear, high quality training material is what you are looking for.

Think like a computer

In the case of an image classification application, the output from the model tells you how well it was trained on the data set. This comes in the form of error rates, much like when a student takes an exam: you can tell how well they studied by seeing how many questions — and which ones — they get wrong.

In order to reduce error rates, your image data set needs to be objectively “good” training material. To do this, put yourself in an analytical mindset and ask yourself:

  • What images will the computer get the most useful information out of? Make sure all the relevant features are visible.
  • What is it about an image that confused the model? When it makes an error, try to understand why — objectively — by looking at the entire picture.
  • Is this image a “one-off” or a typical example of what the end-users will send? Consider creating a new subclass of exceptions to the norm.

Be sure to communicate to your SMEs that model performance is directly tied to data quality and give them clear guidance:

  • Provide visual examples of what works.
  • Provide counter-examples of what does not work.
  • Ask for a wide variety of data points. In the X-ray example, be sure to get patients with different ages, genders, and races.
  • Provide options to create subclasses of your data for further refinement. Use that X-ray from a patient with prior surgery as a subclass, and eventually as you can get more examples over time, the model can handle them.

This also means that you should become familiar with the data they are working with — perhaps not expert level, but certainly above a novice level.

Lastly, when working with SMEs, be cognizant of the impression they may have that the work you are doing is somehow going to replace their job. It can feel threatening when someone asks you how to do your job, so be mindful.

Ideally, you are building a tool with honest intentions and it will enable your SMEs to augment their day-to-day work. If they can use the tool as a second opinion to validate their conclusions in less time, or perhaps even avoid mistakes, then this is a win for everyone. Ultimately, the goal is to allow them to focus on more challenging situations and achieve better outcomes.

I have more to say on this in my closing remarks.

Stakeholders

These are the people you will have the closest relationship with.

Stakeholders are the ones who created the business case to have you build the machine learning model in the first place.


They have a vested interest in having a model that performs well. Here are some key points when working with your stakeholders:

  • Be sure to listen to their needs and requirements.
  • Anticipate their questions and be prepared to respond.
  • Be on the lookout for opportunities to improve your model performance. Your stakeholders may not be as close to the technical details as you are and may not think there is any room for improvement.
  • Bring issues and problems to their attention. They may not want to hear bad news, but they will appreciate honesty over evasion.
  • Schedule regular updates with usage and performance reports.
  • Explain technical details in terms that are easy to understand.
  • Set expectations on regular training and deployment cycles and timelines.

Your role as an AI/ML Engineer is to bring to life the vision of your stakeholders. Your application is making their lives easier, which justifies and validates the work you are doing. It’s a two-way street, so be sure to share the road.

End-users

These are the people who are using your application. They may also be your harshest critics, but you may never even hear their feedback.


Think like a human

Recall above when I suggested to “think like a computer” when analyzing the data for your training set. Now it’s time to put yourself in the shoes of a non-technical user of your application.

End-users of an image classification model communicate their understanding of what’s expected of them by way of the images they submit. Poor images are like answers from the students that didn’t study for the exam, or worse, didn’t read the questions, so their answers don’t make sense.

Your model may be really good, but if end-users misuse the application or are not satisfied with the output, you should be asking:

  • Are the instructions confusing or misleading? Did the user focus the camera on the subject being classified, or is it more of a wide-angle image? You can’t blame the user if they follow bad instructions.
  • What are their expectations? When the results are presented to the user, are they satisfied or are they frustrated? You may notice repeated images from frustrated users.
  • Are the usage patterns changing? Are they trying to use the application in unexpected ways? This may be an opportunity to improve the model.

Inform your stakeholders of your observations. There may be simple fixes to improve end-user satisfaction, or there may be more complex work ahead.

If you are lucky, you may discover an unexpected way to leverage the application that leads to expanded usage or exciting benefits to your business.

Explainability

Most AI/ML models are considered “black boxes” that perform millions of calculations on extremely high dimensional data and produce a rather simplistic result without any reasoning behind it.

The Answer to the Ultimate Question of Life, the Universe, and Everything is 42.
— The Hitchhiker’s Guide to the Galaxy

Depending on the situation, your end-users may require more explanation of the results, such as with medical imaging. Where possible, you should consider incorporating model explainability techniques such as LIME, SHAP, and others. These explanations can help put a human touch on cold calculations.
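LIME and SHAP are full libraries with their own documentation, but the core intuition (perturb the input and see how the prediction moves) fits in a few lines. Here is a toy, occlusion-style sketch; the feature names, weights, and scoring function are all invented for illustration, and this is not either library’s actual algorithm:

```python
# Toy perturbation-based explanation: blank out one feature at a time and
# measure how much a (hypothetical) classifier's score drops. This mirrors
# the intuition behind LIME/SHAP without reproducing their actual methods.

def cat_score(features: dict) -> float:
    # Stand-in for a trained classifier's confidence; weights are made up.
    weights = {"whiskers": 0.5, "pointed_ears": 0.3, "fur_texture": 0.2}
    return sum(weights[name] * features[name] for name in weights)

def occlusion_importance(features: dict) -> dict:
    base = cat_score(features)
    importance = {}
    for name in features:
        occluded = {**features, name: 0.0}  # "blank out" a single feature
        importance[name] = base - cat_score(occluded)
    return importance

image_features = {"whiskers": 1.0, "pointed_ears": 1.0, "fur_texture": 0.5}
print(occlusion_importance(image_features))
# "whiskers" shows the largest drop, so it contributes most to the cat score
```

The payoff is that an end-user (or a radiologist reviewing a flagged X-ray) gets a ranked list of which inputs drove the decision, instead of a bare probability.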

Now it’s time to switch gears and consider higher-ups in your organization.

Marketing team

These are the people who promote the use of your hard work. If your end-users are completely unaware of your application, or don’t know where to find it, your efforts will go to waste.

The marketing team controls where users can find your app on your website and link to it through social media channels. They also see the technology through a different lens.

Gartner hype cycle. Image from Wikipedia – https://en.wikipedia.org/wiki/Gartner_hype_cycle

The above hype cycle is a good representation of how technical advancements tend to flow. At the beginning, there can be an unrealistic expectation of what your new AI/ML tool can do — it’s the greatest thing since sliced bread!

Then the “new” wears off and excitement wanes. You may face a lack of interest in your application as the marketing team (as well as your end-users) moves on to the next thing. In reality, the value of your efforts is somewhere in the middle.

Understand that the marketing team’s interest is in promoting the use of the tool because of how it will benefit the organization. They may not need to know the technical inner workings. But they should understand what the tool can do, and be aware of what it cannot do.

Honest and clear communication up-front will help smooth out the hype cycle and keep everyone interested longer. This way the crash from peak expectations to the trough of disillusionment is not so severe that the application is abandoned altogether.

Leadership team

These are the people that authorize spending and have the vision for how the application fits into the overall company strategy. They are driven by factors that you have no control over and you may not even be aware of. Be sure to provide them with the key information about your project so they can make informed decisions.


Depending on your role, you may or may not have direct interaction with executive leadership in your company. Your job is to summarize the costs and benefits associated with your project, even if that is just with your immediate supervisor who will pass this along.

Your costs will likely include:

  • Compute and storage — training and serving a model.
  • Image data collection — both real-world and synthetic or staged.
  • Hours per week — SME, MLOps, AI/ML engineering time.

Highlight the savings and/or value added:

  • Provide measures on speed and accuracy.
  • Translate efficiencies into FTE hours saved and customer satisfaction.
  • Bonus points if you can find a way to produce revenue.

Business leaders, much like the marketing team, may follow the hype cycle:

  • Be realistic about model performance. Don’t try to oversell it, but be honest about the opportunities for improvement.
  • Consider creating a human benchmark test to measure accuracy and speed for an SME. It is easy to say human accuracy is 95%, but it’s another thing to measure it.
  • Highlight short-term wins and how they can become long-term success.

Conclusion

I hope you can see that, beyond the technical challenges of creating an AI/ML application, there are many humans involved in a successful project. Being able to interact with these individuals, and meet them where they are in terms of their expectations from the technology, is vital to advancing the adoption of your application.


Key takeaways:

  • Understand how your application fits into the business needs.
  • Practice communicating to a non-technical audience.
  • Collect measures of model performance and report these regularly to your stakeholders.
  • Expect that the hype cycle could help and hurt your cause, and that setting consistent and realistic expectations will ensure steady adoption.
  • Be aware that factors outside of your control, such as budgets and business strategy, could affect your project.

And most importantly…

Don’t let machines have all the fun learning!

Human nature gives us the curiosity we need to understand our world. Take every opportunity to grow and expand your skills, and remember that human interaction is at the heart of machine learning.

Closing remarks

Advancements in AI/ML have the potential (assuming they are properly developed) to do many tasks as well as humans. It would be a stretch to say “better than” humans, because a model can only be as good as the training data that humans provide. However, it is safe to say AI/ML can be faster than humans.

The next logical question would be, “Well, does that mean we can replace human workers?”

This is a delicate topic, and I want to be clear that I am not an advocate of eliminating jobs.

I see my role as an AI/ML Engineer as being one that can create tools that aid in someone else’s job or enhance their ability to complete their work successfully. When used properly, the tools can validate difficult decisions and speed through repetitive tasks, allowing your experts to spend more time on the one-off situations that require more attention.

There may also be new career opportunities, from the care-and-feeding of data, quality assessment, user experience, and even to new roles that leverage the technology in exciting and unexpected ways.

Unfortunately, business leaders may make decisions that impact people’s jobs, and this is completely out of your control. But all is not lost — even for us AI/ML Engineers…

There are things we can do

  • Be kind to the fellow human beings that we call “coworkers”.
  • Be aware of the fear and uncertainty that comes with technological advancements.
  • Be on the lookout for ways to help people leverage AI/ML in their careers and to make their lives better.

This is all part of being human.

August Edition: Writing Better as a Data Scientist
Published August 1, 2022
The benefits of developing your writing skills range from greater visibility as a job candidate to a deeper learning process

The post August Edition: Writing Better as a Data Scientist appeared first on Towards Data Science.

MONTHLY EDITION

Every data scientist is a writer. Even if you haven’t published with us on TDS, you still share slide decks, reports, and code documentation with your managers, stakeholders, and colleagues on a daily basis. Have you ever encountered a job description that didn’t list "strong Communication Skills" as a requirement? Neither have we.

When you write well about your data science or ML projects, it’s not only your audience that benefits from it—it’s also you, the author. To drive the point home, we’ve collected several recent articles from data practitioners who came to recognize the advantages of developing their writing skills. Scroll down to read about their journeys and to learn from the insights they’ve gained along the way.

If your writerly ambitions go beyond what’s required for your job and you’re thinking about sharing your expertise with a broader professional circle, well, that’s music to our ears. To support more members of our community who want to reap the rewards of public writing, earlier this month we launched Writer’s Workshop: a new series geared towards current and aspiring TDS contributors who’d like to polish their craft. If you’ve yet to read The Art of Promotion, the series’ inaugural post, you’re missing out—it’s a rich resource on building an audience and expanding your reach as a data professional.

Speaking of our authors: if you feel inspired to give them a boost, consider showing your support by becoming Medium members.

TDS Editors


TDS Editors Highlights

  • How Writing on the Internet Can Help You Get a Data Job "I started writing about Data Science for fun. […] But then I realized how helpful those articles I wrote were (and still are) to my career." Otávio Simões Silveira explains how his blogging led to work opportunities, and shares advice on getting started. (July 2022, 5 minutes)

  • My Technical Writing Journey – After writing dozens of articles and attracting readers by the thousands, Xiaoxu Gao reflects on the challenge of converting ideas and problems into blog posts that connect with others. (June 2022, 9 minutes)
  • Writing Can Help You Become a Better Data Scientist – Give your motivation a boost and read this succinct overview by Frank Andrade, who explains how improving your writing can lead to a deeper understanding of the concepts you’re unpacking and help you establish your authority in the field (something recruiters are likely to notice, too). (July 2022, 4 minutes)
  • A Practical Step for Improving Data Literacy "I would love to see a future where no one is intimidated by the world of data," says Megan Dibble, who argues that the way to reach that goal is to make data science learning resources more readable and easier to understand. (May 2022, 4 minutes)

  • My 6-Step Process for Writing Technical Articles – If you feel inspired to finally put quill to paper (or, fine, fingers to keyboard) but find the actual drafting process daunting, Conor O’Sullivan is here to help. His detailed roadmap for technical writers covers everything from idea generation to editing, and will help you push your writing from "hot mess" to "polished gem."

Original Features

The latest crop of articles and resources by the TDS team:

  • The Art of Promotion. Networking, keywords, distribution channels… Our first Writer’s Workshop article covers the essentials of finding—and growing—your audience as a data scientist, on TDS and beyond.
  • Author’s Corner. To help you find resources on writing (and related topics), we’ve created this handy, frequently updated digest, which you’ll also see pinned on our homepage.
  • "A Data Professional without Business Acumen Is Like a Sword without a Handle." Our latest Author Spotlight features Senior Data Analyst Rashi Desai, who shared insights about career paths, finding your purpose as a data professional, and the importance of communicating your results effectively.
  • Summer Reads for Data Scientists. It’s been a very hot stretch more or less everywhere, so if you want to learn something new but really need it to be beach-reading-friendly, we’ve got you covered with a selection of fun, engaging articles.

Popular Posts

In case you missed them, here are some of the most popular articles we published in the past month.


If you’d like to join the ranks of our author community, take the plunge and share your work with us. You’ll be part of a group of smart, accomplished writers—including all the new contributors whose work we published for the first time last month: Quentin Gallea, PhD, Renato Sortino, Stacy Li, Sidney Hough, Edgar A Aguilar, Simon Aubury, Josh Berry, Neha Desaraju, Ajay Halthor, María Bräuner, Soran Ghaderi, Francisco Castillo Carrasco, Pierre Blanchard, TonyM, Leonardo Rigutini, Christian Wanser, Ning Jia, Nick Konz, Phil Bell, Carolyn Wang, Nnamdi Uzoukwu, Felipe González-Pizarro, Andrew Blance, Branden Lisk, Daniel Kulik, Mark Derdzinski, and Othmane Hamzaoui, among others.

The Power and Danger of Intuition in Data Science
Published December 17, 2020

The post The Power and Danger of Intuition in Data Science appeared first on Towards Data Science.

Monty Hall’s Assistant

The Danger (and Power) of Intuition in Data Science

As I start my new journey as a career Data Scientist, I had to cross paths with my old hypothetical nemesis: the Monty Hall Problem. To those of you who have a deep understanding of Bayesian statistics, welcome to Talking to People Who Hate Math 101; and to those of you who don’t know what I’m talking about, let’s discuss how the human understanding of ‘truth’ and ‘luck’ can lead us astray, with the help of a couple of goats.

Data and Communication

This article is meant to be helpful to two kinds of people: those with extensive statistical experience, and those who vomit in their mouth at the thought of conditional probabilities. As the field of data science evolves and deepens, as we dive further into the complexities of Machine Learning, we are in danger of creating a comprehension gap between data scientists and the stakeholders who employ them. In my experience, C-level executives tend to lack the intensive math background to understand technical jargon. It isn’t their job to know that stuff. Their job is to lead, to set OKRs, to manage people, to hire you to understand it. Our job, as data scientists, is to be experts in our field and translate the underlying network of interlocking mathematical theories into bite-size presentations.

A Data Scientist cannot crunch numbers in a vacuum. As mathematicians, we have to both uncover the truth and share the truth. And sharing the truth is typically where this process breaks down.

The Issue with Intuition

Humans are pattern-making beasts. We look to the past to inform our actions. We are also spiteful and chaotic. If you believe that human behavior can truly be perfectly predicted on a granular level, you’re in the wrong field, or you’re going to get a rude wake-up call. Here’s a fun example:

The Salesperson – You’ve been hired to analyze the sales patterns at a software company, and so you, the diligent data scientist, begin crunching numbers: running ANOVA tests, A/B testing. You even build a beautiful model that predicts, with 80% certainty, that the best time to schedule a follow-up call with an account is 17 days after the first email. Your data says this will, on average, increase sales. Done deal, right? You tell the sales team your findings: if they do one simple thing (call clients at the 17-day mark), their sales will increase. You have the data! The truth is in your hands!

Well, Salesperson A replies with, "Yeah, that’s not how I do it. I like to get in on day 2. Day 17 is too late."

How do you respond? Do you tell them they’re wrong? That the data is clear? That the other salespeople are doing the 17 day rule and they have a higher conversion rate? For some people that might be enough, but Salesperson A trusts their gut. How do you convince them?

The Monty Hall Problem

This is both my favorite and least favorite thought experiment in probability. I’ve always had difficulty understanding probabilities intuitively. Sure, any layman gets that the chance of rolling a 6 on a 6-sided die is 1/6, but ask them the probability of rolling a total of 7 on two dice while also getting tails on a coin flip, and everyone’s head starts exploding. I have always struggled with Monty Hall, but it’s the best tool to show how misleading intuition can be.
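That two-dice-and-a-coin example is actually quick to settle: enumerate every equally likely outcome and count. A small sketch:

```python
from fractions import Fraction
from itertools import product

# All (die1, die2, coin) combinations: 6 * 6 * 2 = 72 equally likely outcomes.
outcomes = list(product(range(1, 7), range(1, 7), ["heads", "tails"]))
hits = [o for o in outcomes if o[0] + o[1] == 7 and o[2] == "tails"]
p = Fraction(len(hits), len(outcomes))
print(p)  # 1/12: six ways to roll a 7 (out of 36), times a 1/2 chance of tails
```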

Goat or Car? Monty Hall hosted a popular game show, but it won’t be remembered for its charming host or musical guests; it will be forever inscribed into the annals of Bayesian statistics because of one dumb game. The setup is seemingly simple: the contestant has a chance to win a new car, but it’s hidden behind one of three doors. Behind each of the other two doors stands a goat.

(Side note – why goats? Who took care of these goats? Did they have a Monty Hall petting zoo? Were the goats enemies? Friends? Lovers? I have a lot of head-canon questions, but I digress.)

The game: the contestant picks one of the three doors. After the choice is made, the assistant opens one of the other two doors to reveal a goat. So two doors remain, your door and one more. Which one has the car behind it? Well, you’re given an option: stay with your current door, the door you picked in the first place, the door that spoke to you, the door that the universe said ‘that’s the freaking door!’…or do you change your mind?

If you don’t know the twist, you probably think you want to stay with your first choice right? Why is that? Is it because changing your mind is seen as weak? Is it because you believe that luck is on your side? Those internal, invisible questions are why Monty Hall got to keep playing this game for years without giving away too many cars. They turn a statistical game into a judgement on you and your intelligence. "Of course I picked right the first time," we think and then get to walk home instead of drive.

Probabilistically? You’re wrong. If you play the Monty Hall game and you always stay with your first choice, you will only succeed 1/3 of the time. However, if you always swap, you win 2/3 of the time.
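You don’t have to take those numbers on faith; a quick Monte Carlo simulation bears them out (a minimal sketch, where the seed and trial count are arbitrary):

```python
import random

def play_monty_hall(switch: bool, rng: random.Random) -> bool:
    """Simulate one round; return True if the contestant wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The assistant opens a door that is neither the pick nor the car.
    opened = rng.choice([d for d in doors if d not in (pick, car)])
    if switch:
        # Switch to the one remaining unopened door.
        pick = next(d for d in doors if d not in (pick, opened))
    return pick == car

rng = random.Random(42)
trials = 100_000
stay = sum(play_monty_hall(False, rng) for _ in range(trials)) / trials
swap = sum(play_monty_hall(True, rng) for _ in range(trials)) / trials
print(f"stay: {stay:.3f}, switch: {swap:.3f}")  # close to 0.333 and 0.667
```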

How Intuition Messes Us Up

This choice might feel ‘wrong’ to you. It goes against your gut. Now, I could sit here and explain Bayes theorem and why 2/3 works, but that won’t satisfy your annoyance at being wrong. At this point, you either know the theorem, or you don’t care about it. Those probabilities just don’t feel right. Why not? Let’s break it down.

What’s the chance of picking the right door in the first place? Well, it’s an easy 1 in 3. There’s no way to increase those odds, you’re picking a door and either a car is behind or a goat. But then…a door opens.

What’s the chance of picking the right door a second time? This is where the brain trips up, and it’s for two reasons. Trap number 1) you aren’t being asked to ‘pick a door’; you’re being asked to ‘change your mind’. Anyone who follows politics knows that changing your mind is tantamount to weakness, and so people have no desire to do it. Trap number 2) you see these choices as independent of the first choice. That means, even though you’re asked to pick between two doors, it feels like there’s a 50% chance of success… so both doors are equally likely to have a car, right?

Nope.

Both of these approaches are wrong and you’ll see why just by changing our perspective a little.

New Perspective #1 – What if there were 10 doors? It’s the same game, but one car and ten doors (which means 9 goats, damn that Monty Hall petting zoo is getting crowded!). You pick a door, and then they reveal a goat behind one of the other 9 doors. You now have 8 other doors to choose from. Do you stay with your 1-in-10 chance? Or do you pick another door? I expect in this version of the game, it would be foolhardy to NEVER examine your first choice. Every open door is more information; it’s another option gone, and your chances of getting the correct door increase. This is why the Monty Hall game is played with 3 doors and not 4: because if you ‘change your mind’ once, you’re apt to change it multiple times. So that dastardly Monty only gives you one chance to change your mind.
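The 10-door version can be simulated too. This sketch assumes the assistant reveals exactly one goat and a switcher then picks uniformly at random among the remaining closed doors. Under those rules, staying wins 1/10 of the time, while switching wins 9/10 × 1/8 = 9/80 ≈ 0.113, a small edge after a single reveal that grows with every additional door opened:

```python
import random

def play(n_doors: int, switch: bool, rng: random.Random) -> bool:
    """One round with n_doors; the assistant reveals a single goat door."""
    car = rng.randrange(n_doors)
    pick = rng.randrange(n_doors)
    opened = rng.choice([d for d in range(n_doors) if d not in (pick, car)])
    if switch:
        # Re-pick uniformly among the doors that are still closed.
        pick = rng.choice([d for d in range(n_doors) if d not in (pick, opened)])
    return pick == car

rng = random.Random(0)
trials = 200_000
stay = sum(play(10, False, rng) for _ in range(trials)) / trials
swap = sum(play(10, True, rng) for _ in range(trials)) / trials
print(f"stay: {stay:.4f}, switch: {swap:.4f}")  # near 0.1000 and 0.1125
```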

This is the essence of Bayesian Statistics. Reverend Thomas Bayes asked the question, "How do I find something if I have no information about it?" Well, you take guesses and see where you landed and then you try again. It is an iterative and cumulative process of gathering more and more experimental data to get you closer to a question that is difficult to answer. This makes sense, right? The more experiments you do and knowledge you glean, the better your choices will be.

New Perspective #2 – What about the assistant? What if we’re looking at this problem all wrong? We’ve been focused on that contestant trying to get a new Honda Civic. What if we consider the POV of the assistant who has to open the doors? I know, it’s not a hard job (unless there are specific goat-related challenges that come up), but the problem suddenly coalesces into something that makes sense when you watch the game from the sidelines.

Let’s say, the contestant picks door 1 and the car is behind door 3. Which door does the assistant remove? Door 2. They can’t open door 1, because you chose that one and that would ruin the game. They can’t open door 3, because, you know, that would be too easy. And so, when the contestant picks the wrong door, the assistant only has 1 out of 3 options to remove.

What if you pick the correct door? Then the assistant has 2 doors to pick from! Which do they choose? Who knows! That must be the exciting part of the job. In this case, they have 2 out of 3 options.

Remember, you only had a 1/3 chance of being right in the first place. Which means, 2/3 of the time, the assistant has only one option, and the remaining door has the car behind it. Therefore, 2/3 of the time, staying with your first choice is the fool’s choice. The removal of the door is new information, and it DOES affect your next choice. The assistant doesn’t remove a door at random; they do it because of an algorithm. That’s why the game is weighted. In essence, you’re only picking between two doors, not three; it’s just that one has a 1/3 chance of success and the other has 2/3.

Suddenly those probabilities make sense! And it simply required a shift of perspective.
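For those who do want the theorem, the same 2/3 falls straight out of Bayes’ rule. A sketch of the exact arithmetic, following the example above where the contestant picks door 1 and the assistant opens door 2:

```python
from fractions import Fraction

# Prior: the car is equally likely to be behind any of the three doors.
prior = {d: Fraction(1, 3) for d in (1, 2, 3)}

# Likelihood that the assistant opens door 2, given the car's location,
# when the contestant has already picked door 1:
likelihood = {
    1: Fraction(1, 2),  # car behind the pick: door 2 or 3, chosen at random
    2: Fraction(0),     # the assistant never reveals the car
    3: Fraction(1),     # doors 1 and 3 are off-limits, so door 2 is forced
}

evidence = sum(prior[d] * likelihood[d] for d in prior)
posterior = {d: prior[d] * likelihood[d] / evidence for d in prior}
print(posterior[1], posterior[3])  # prints "1/3 2/3": switching doubles your odds
```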

Back to the Salesperson

To the strong-willed, a change of perspectives is a challenge. Convincing someone to go against their gut, even with a wealth of data on hand, is always going to be an uphill battle. Remember, your job as a data scientist isn’t to change someone’s world, it’s to communicate the truth as effectively as possible.

But, maybe, just maybe, you can sow a little doubt to encourage those headstrong individuals to look outside of what ‘works’ and see if they can discover what works better.

Our job is to discover and communicate truth. We study, read, and blog about the discovery part, but all of that work is nothing without Communication. For many of you starting to code, playing with pandas, exploring GitHub, you’re learning the tools of the trade. But never forget, the most difficult tool to master is how to convince people who just want to feel right.
