MorphL Community Edition – From Prototype to Open-Source Platform (Alpha Release)

At the beginning of this year, MorphL started through a grant provided by the Google Digital News Initiative Fund. Today, we are proud to announce the release of the MorphL Community Edition (alpha version). Below you can read the story of MorphL – how we got from a paper prototype to a functional platform.

The Vision

For a long time, we have been interested in data-driven product development. When it comes to data, we tend to think in dichotomies: quantitative vs. qualitative, objective vs. subjective, messy vs. curated, science vs. story. But using data for product development does not have to be an either/or choice.

We all know that the product development process usually consists of a few important phases: planning, design, development and launch. To increase engagement and conversion rates, this process goes through multiple iterations, which developers seldom navigate by looking at the data. Usually somebody else, be it a product owner, marketer or salesperson, analyzes the data and hands developers a feature list for the next product release. Therein lies the gap between developers, marketers and users, which usually leads to a lot of guesswork.

That’s why organizations have started developing metrics that go beyond what the default off-the-shelf analytics provider offers. Rather than relying on single signals (such as the number of page views or the number of clicks) to determine conversion, the trend is now towards using machine learning that draws insights from multiple signals or event streams to provide a personalized user experience.

The vision behind MorphL is to use machine learning to predict user behavior in mobile & web applications and enable personalized user experiences with the end goal of increasing engagement & conversion rates.

How Does MorphL Work?

At a very high level, the MorphL platform consists of various components.

  • We integrate with various data sources (Google Analytics, HotJar, Mixpanel, Kissmetrics, etc.) to identify user behaviors.
  • We develop machine learning algorithms to build predictive models.
  • And then developers can programmatically consume these models to build personalized user experiences in mobile or web applications.

This enables businesses to be proactive instead of reactive: you can offer incentives (discounts, vouchers, etc.) before a user actually churns, abandons a purchase and so on.
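To make the proactive idea concrete, here is a minimal Python sketch of how an application might act on a churn prediction. It is not the MorphL API: the `ChurnPrediction` type, the `choose_incentive` function, the 0.7 threshold and the incentive name are all hypothetical and only illustrate the pattern of reacting to a prediction before the user leaves.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChurnPrediction:
    """Hypothetical shape of a prediction consumed by an application."""
    user_id: str
    churn_probability: float  # expected to be in the range [0, 1]


def choose_incentive(prediction: ChurnPrediction,
                     threshold: float = 0.7) -> Optional[str]:
    """Return an incentive name for users likely to churn, otherwise None.

    The threshold and the incentive name are illustrative assumptions.
    """
    if not 0.0 <= prediction.churn_probability <= 1.0:
        raise ValueError("churn_probability must be between 0 and 1")
    if prediction.churn_probability >= threshold:
        return "discount_voucher"
    return None


if __name__ == "__main__":
    at_risk = ChurnPrediction(user_id="u1", churn_probability=0.9)
    loyal = ChurnPrediction(user_id="u2", churn_probability=0.2)
    print(choose_incentive(at_risk))
    print(choose_incentive(loyal))
```

In a real application the threshold would likely be tuned per business, since the cost of an unnecessary discount differs from the cost of a lost user.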

How Was MorphL Developed?

Our first goal was to come up with a process that would enable us to create successful proofs of concept.

Since we needed data that we could feed into our predictive models, we worked together with multiple publishing and e-commerce websites, the first one on the list being PressOne, our Google DNI partner.

The data gathering process was slow and painful. The free version of Google Analytics (which is the analytics platform of choice for many of our data partners) offers access only to aggregated data, so we had to come up with our own setup process for exporting granular data. Unfortunately this process does not apply retroactively – so we had to wait for data to collect before moving on to creating proofs of concept.

Once we had enough data, we moved on to the feature analysis and engineering stage. Our previous experience in developing web applications became very important because it enabled us to be proactive and understand how we could best use the data.

We started with the following use cases:

  • Predicting churning users, for publishers
  • Predicting a user’s shopping stage, especially for eCommerce
  • Predicting a user’s next action, for publishers
  • Recommender systems, for both publishing and eCommerce businesses
  • User intent prediction, especially for eCommerce businesses

Each of these use cases is validated through a successful proof of concept before being integrated into the MorphL platform.

MorphL Platform Architecture

The MorphL Platform consists of two main components:

  • MorphL Platform Orchestrator – This is the backbone of the platform. It sets up the infrastructure required for running pipelines for each model.
  • MorphL Pipelines – These consist of Python scripts for retrieving data from various sources, pre-processing it, training a model and generating predictions.

While our Machine Learning engineers were creating proofs of concept, our Big Data engineers were busy working on a scalable platform that can be used in production. The infrastructure setup was centralized into the Orchestrator, which now runs three pipelines:

  • Ingestion Pipeline – It runs a series of connectors responsible for gathering data from various APIs (Google Analytics, Mixpanel, etc.) and saving it into Cassandra tables.
  • Training Pipeline – Consists of pre-processors (responsible for cleaning, formatting, deduplicating, normalizing and transforming data) and model training.
  • Prediction Pipeline – It generates predictions based on the model that was trained. It is triggered at the final step of the ingestion pipeline through a preflight check.
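The flow between the three pipelines can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the MorphL code: plain in-memory lists stand in for Cassandra tables, the "model" is a placeholder threshold rather than a trained classifier, and the function names, the `sessions` field and the contents of the preflight check (here: a trained model exists and there are rows to score) are all hypothetical.

```python
from statistics import mean
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]
Model = Dict[str, float]


def preprocess(rows: List[Row]) -> List[Row]:
    """Deduplicate rows by user_id and min-max normalize the 'sessions' field."""
    if not rows:
        return []
    unique = {row["user_id"]: row for row in rows}  # last occurrence wins
    values = [row["sessions"] for row in unique.values()]
    low, high = min(values), max(values)
    span = (high - low) or 1.0  # avoid division by zero when all values match
    return [{**row, "sessions": (row["sessions"] - low) / span}
            for row in unique.values()]


def train(rows: List[Row]) -> Model:
    """Placeholder 'model': the mean of the normalized sessions."""
    return {"threshold": mean(row["sessions"] for row in rows)}


def preflight_check(model: Optional[Model], rows: List[Row]) -> bool:
    """Assumed check: a trained model exists and there is data to score."""
    return model is not None and len(rows) > 0


def predict(model: Model, rows: List[Row]) -> Dict[str, bool]:
    """Flag users whose normalized sessions fall below the model threshold."""
    return {row["user_id"]: row["sessions"] < model["threshold"] for row in rows}


def finish_ingestion(raw_rows: List[Row],
                     model: Optional[Model]) -> Optional[Dict[str, bool]]:
    """Final ingestion step: run predictions only if the preflight check passes."""
    rows = preprocess(raw_rows)
    if not preflight_check(model, rows):
        return None
    return predict(model, rows)


if __name__ == "__main__":
    raw = [
        {"user_id": "a", "sessions": 1.0},
        {"user_id": "a", "sessions": 1.0},  # duplicate, dropped in preprocess
        {"user_id": "b", "sessions": 5.0},
        {"user_id": "c", "sessions": 9.0},
    ]
    model = train(preprocess(raw))
    print(finish_ingestion(raw, None))   # no model yet, so nothing is predicted
    print(finish_ingestion(raw, model))
```

Gating predictions behind a check like this means a fresh ingestion run never tries to score data before a model has been trained.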

Our first use case (predicting churning users for publishers) is already integrated into the platform and available on GitHub.

What’s Next for MorphL?

Our plan is to extend the platform and add pipelines for all the use cases described above. At the same time, we are searching for new partners and data source APIs, and working on improving existing models.

If you would like to become a data partner or find out more about how MorphL can help your business, don’t hesitate to contact us!
