Cake
    • Thanks for taking the time to reply and for the constructive feedback, Chris. I think there are several questions that I would love to discuss here in a panel format.  

      DS

      How prevalent is data science in your local ecosystem?  (You may have seen the report that came out last month that 80% of data scientists work at Facebook, Amazon, Netflix or Google.)

      Artificial Intelligence and Machine Learning

      Is the “iterate till it works” process the norm in AI?  (It certainly feels that way when you read the news of accidents with Tesla’s assisted driving mode.)

      Marketing

      What’s your experience in using data science/data analysis in marketing?  (A/B testing, Natural Language Processing, recommender systems)

      Data Engineering/Data Visualization

      What are the new tools worth considering over the established ones? (PyTorch versus TensorFlow, Google Data Studio versus Tableau, Scala versus Python.)

      Hope the above clarification helps.

    • Big Data

      Thank you for the invitation, it’s been a long time since I last was here. Coincidentally, I’m currently working on an AI/ML-related project and have been trying to learn more about Data Science as well. Here are some contributions:

      1) “How prevalent is data science in your local ecosystem?” - our project is in education, trying to improve all aspects of the E-learning process through ML. Researching the current landscape doesn’t turn up many results. Coursera seems to be more active, with some efforts on its website (recommendations, search results) and in email marketing, but that’s about where things end from what I have seen.

      2) “Is the “iterate till it works” process the norm in AI?” - I am no AI expert, but I think that an iteration-validation cycle is a basic part of any software development, with or without AI/ML. Unless you meant something different here. ML definitely includes validation of the model’s output as a crucial step.

      3) “What’s your experience in using data science/data analysis in marketing?” - I haven’t had the chance to do something in terms of marketing use. As noted above, Coursera seems to be making some effort, periodically sending out updates about new courses based on previous participation in a course.

      4) “What are the new tools worth considering over the established ones?” - as always with software development, the choice of tools depends on the project requirements. I don’t think one should choose sides. What I find interesting is that we have 4 major companies (Amazon, Microsoft, Google, IBM) with well-developed portfolios of cloud-based AI/ML/DS tools. In large part these can be swapped around or combined. It’s also very positive to see these companies trying to fill the education gap. IBM and Amazon in particular are offering free/low-cost online courses that can help more people get into DS and AI.

      I hope these comments help in any way :)

      Thank you for reading!

    • “Thank you for the invitation, it’s been a long time since I last was here.”

      I suppose that “Welcome back, MarkG!” then is an appropriate response.

      “Our project is in education, trying to improve all aspects of the E-learning process through ML. Researching the current landscape doesn’t turn up many results. Coursera seems to be more active, with some efforts on its website (recommendations, search results) and in email marketing, but that’s about where things end from what I have seen.”

      You may find this data science podcast on creating adaptive tests useful.  Instead of every student being tested on 50 items regardless of ability, the computer-based test serves a harder or easier question after each response until it has determined your competency.  There are some real opportunities here for “natural language processing (NLP)” and other machine learning approaches. Hope you find it interesting.
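
      To make the "harder or easier after each response" idea concrete, here is a minimal sketch. It is not the method from the podcast: real adaptive tests usually rely on item response theory, whereas this is a simple halving staircase. The names run_adaptive_test and answer_item, the 0-10 difficulty scale, and the stopping thresholds are all assumptions made up for illustration.

```python
from typing import Callable, Tuple


def run_adaptive_test(
    answer_item: Callable[[float], bool],
    low: float = 0.0,
    high: float = 10.0,
    min_step: float = 0.25,
    max_items: int = 50,
) -> Tuple[float, int]:
    """Estimate ability on a [low, high] difficulty scale.

    answer_item(difficulty) is a hypothetical callback that presents one
    question of the given difficulty and returns True if it was answered
    correctly.
    """
    estimate = (low + high) / 2   # start in the middle of the scale
    step = (high - low) / 4       # size of the first adjustment
    asked = 0
    while step >= min_step and asked < max_items:
        correct = answer_item(estimate)
        asked += 1
        # Correct answer: next question is harder. Wrong answer: easier.
        estimate += step if correct else -step
        estimate = max(low, min(high, estimate))
        step /= 2                 # narrow in on the ability level
    return estimate, asked


# A simulated student who gets every item at or below difficulty 7 right.
estimate, asked = run_adaptive_test(lambda difficulty: difficulty <= 7.0)
print(f"estimated ability {estimate} after {asked} questions")
```

      With these made-up settings the test stops after a handful of questions instead of the fixed 50, which is the time saving the adaptive approach is after.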

    • Many thanks 🙏 for the link! Adaptive testing is definitely a valuable concept, not only to save time (as mentioned in the podcast), but also to provide variation and reduce cheating (while still being fair towards students being evaluated through different tests). In our project we are also working (using ML) on adaptive learning paths and on providing additional reference material as a means of accelerating the learning process and improving completion rates.

    • “I haven’t had the chance to do something in terms of marketing use. As noted above, Coursera seems to be making some effort, periodically sending out updates about new courses based on previous participation in a course.”

      For learning Big Data and Marketing, I’ve found that it requires a lot more digging to find useful case studies and education. 

      A recent interview I read with a data scientist at Quora on their A/B testing approach was fascinating: they have hundreds of millions of ML-generated views so that each user receives a feed of content that is optimized for their behavior-exhibited preferences.  

      What are “behavior-exhibited preferences?”

      Quite simply, it’s recommendations based on what you do rather than what you say you want.  For example, the most interesting thing I learned about Netflix’s recommender system is that it makes suggestions based on what you actually watch, not what you add to your queue: all those intellectually curious documentaries that have been sitting unwatched in your queue for six months won’t fool Netflix that what you really want is more comedy flicks.


    • Quite simply, it’s recommendations based on what you do rather than what you say you want.  For example, the most interesting thing I learned about Netflix’s recommender system is that it makes suggestions based on what you actually watch, not what you add to your queue: all those intellectually curious documentaries that have been sitting unwatched in your queue for six months won’t fool Netflix that what you really want is more comedy flicks.

      This is also described as explicit vs implicit ratings. Explicit means marking something as a 1-5 rating, clicking a like button, etc. Implicit ratings are deduced from monitoring people’s behavior. Illustration below from a quite helpful book on the subject: "Practical Recommender Systems"
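
      As a rough sketch of the difference (my own toy example, not taken from the book): an explicit rating is stored as given, while an implicit rating has to be deduced from logged behavior. The event names and weights below are hypothetical, picked only for illustration.

```python
from collections import defaultdict

# Hypothetical weights: how strongly each observed action signals interest.
ACTION_WEIGHTS = {
    "added_to_queue": 1.0,
    "started_watching": 2.0,
    "finished_watching": 5.0,
}


def implicit_ratings(events):
    """Turn (user, item, action) events into implicit rating scores.

    Unknown actions contribute nothing to the score.
    """
    scores = defaultdict(float)
    for user, item, action in events:
        scores[(user, item)] += ACTION_WEIGHTS.get(action, 0.0)
    return dict(scores)


events = [
    ("ann", "documentary", "added_to_queue"),
    ("ann", "comedy", "started_watching"),
    ("ann", "comedy", "finished_watching"),
]
ratings = implicit_ratings(events)
print(ratings)
```

      Here the queued-but-unwatched documentary ends up with a much lower score than the comedy that was actually watched, which is the Netflix behavior described in the quote.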

    • Hi Stephen, Sorry for the slow response. I was out of the country for most of the last 3 weeks and largely offline during that time. Interesting topic to kick around.

      It seems that monitoring our behaviors and extracting predictable patterns, which in turn can be converted into actionable insights that will drive future (buying) decisions, is the holy grail of using big data for marketing purposes. The discussion on explicit and implicit ratings touches on this. But where I think AI/ML/Big Data will largely fall a bit short is at the individual level. People are fickle. Statistically, what I did yesterday may have a strong influence on what I did today, but no AI or Big Data process will be able to monitor all of my behaviors and determine that I might wake up tomorrow and decide "today is the day that I am going to start doing X" or "ok... I'm kind of tired of binge watching Netflix" or whatever. Maybe recommendation engines work well for many people, but I have yet to find one that works for me. Just because I like one musical artist does not mean I like some other musical artist that the recommendation engine thinks is closely related, because for me it may not be about "genre" but maybe about something else hidden deeper in the music. But I may just be an oddball.

      Where I see Big Data being really powerful is in sorting out problems that are ridiculously multivariate. Better understanding how individuals in different populations will respond to different medical treatments, for example. Climate. Or industrial applications where 2nd or 3rd order effects on individual variables interact to produce significant, but rare, occurrences which can influence production efficiency, product quality, reliability, and safety. Marketing can fit into this category as well, across large demographics, predicting future trends based on current events, etc. But I think it may be a fool's errand to try to predict the behavior of Jane Smith as if she is nothing other than some algorithm that can be decoded given enough data, the right data scientists, and sufficient server farms.

    • our project is in education, trying to improve all aspects of the E-learning process through ML. Researching the current landscape doesn’t turn up many results. Coursera seems to be more active, with some efforts on its website (recommendations, search results) and in email marketing, but that’s about where things end from what I have seen.

      I thought the below graph was interesting. Manufacturing was generating 4X as much data as education ten years ago. Curious as to whether the ratio is still the same.


    • Although my understanding of data science is very limited, I hold a deep curiosity about how the use of data influences human connection. My biggest concern is how data is used by governments and businesses to influence the will of users. Although there are countless success stories of data being used to make our lives better, I see a fundamental problem in the lack of user data rights and the manipulation of data through media outlets. Even though we have an incredible toolset before us, I still believe we are like a caveman trying to taste a burning branch struck by lightning. It looks awesome, but do we really know what it is capable of?

      This notion of using data as a sword to disrupt industry is beginning to reveal itself as merely a means of monetizing users’ interests in order to build other industries on top of them. The hands wielding the sword are grateful for such a favorable environment in which to sow their influence over users hungry for the carrot of convenience. So what does all of this have to do with your post? My contribution to this panel is to ask that, if you are to be a steward of the technology, you consider how the tech will impact the end user, and how your contribution might lead not only to its application but also to an understanding of its implications. Ultimately these technologies are not being built by huge corporations; they are being built by individuals who could eventually move the cheese toward use cases favorable to the interests of users.

      Thank you for inviting me to this panel Stephen. I have enjoyed reading the thread!!

    • But where I think AI/ML/Big Data will largely fall a bit short is at the individual level. People are fickle. Statistically, what I did yesterday may have a strong influence on what I did today, but no AI or Big Data process will be able to monitor all of my behaviors and determine that I might wake up tomorrow and decide "today is the day that I am going to start doing X" or "ok... I'm kind of tired of binge watching Netflix" or whatever.

      Have you seen this ⬇️? It’s more than five years old, and I suspect the algorithms have improved significantly since then.

      They may not be able to predict when you’ll buy something, but there seems to be more and more data available to predict what you will buy.