Wow – just upload your data sets, link them together and the coffee shop owner is on their way to predicting how much coffee they need to order based on the projected number of customers.
According to Dip Ranjan Chatterjee, the system instantly identifies which models and pre-processing steps it should or shouldn’t run on certain tasks, based on various encoded rules. It first chooses from a large list of possible machine-learning pipelines and runs simulations on the sample set. In doing so, it remembers results and refines its selection. After delivering fast approximate results, the system refines them in the back end, but the final numbers are usually very close to the first approximation.
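To make that loop concrete, here is a minimal sketch in Python of the general idea: score several candidate pipelines on a small sample, remember the scores, drop the weakest candidates, and re-score the survivors on larger samples. It is not Northstar's actual algorithm. The three toy "pipelines", the pruning rule, and names such as `select_pipeline` are assumptions made up for illustration, as is the coffee shop data.

```python
import random
from statistics import mean

# Hypothetical candidate "pipelines": each fits on (xs, ys) and returns a predictor.
def fit_mean(xs, ys):
    m = mean(ys)
    return lambda x: m

def fit_proportional(xs, ys):
    ratio = sum(ys) / sum(xs)
    return lambda x: ratio * x

def fit_linear(xs, ys):
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var if var else 0.0
    return lambda x: my + slope * (x - mx)

CANDIDATES = {"mean": fit_mean, "proportional": fit_proportional, "linear": fit_linear}

def score(fit, train, valid):
    """Mean absolute error of a fitted candidate on held-out rows (lower is better)."""
    predict = fit([x for x, _ in train], [y for _, y in train])
    return mean([abs(predict(x) - y) for x, y in valid])

def select_pipeline(rows, start=20, seed=0):
    """Score candidates on growing samples, pruning the weaker half each round."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    alive = list(CANDIDATES)
    history = {}  # remembered scores per candidate, one entry per round it survived
    size = min(start, len(rows))
    while True:
        sample = rows[:size]
        cut = len(sample) * 3 // 4
        train, valid = sample[:cut], sample[cut:]
        for name in alive:
            history.setdefault(name, []).append(score(CANDIDATES[name], train, valid))
        # Refine the selection: keep the better half (at least one candidate).
        alive.sort(key=lambda name: history[name][-1])
        alive = alive[: (len(alive) + 1) // 2]
        if size >= len(rows):
            break
        size = min(len(rows), size * 2)  # later rounds refine the early estimate
    return alive[0], history

# Made-up data: (customers per day, kilograms of coffee used).
rng = random.Random(42)
rows = [(c, 0.02 * c + 1.5 + rng.gauss(0, 0.1))
        for c in (rng.randint(50, 300) for _ in range(200))]
best, history = select_pipeline(rows)
print(best, [round(s, 3) for s in history[best]])
```

The first entry in `history[best]` plays the role of the fast approximate result; the later entries are the refinements computed on larger samples.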
According to TechCrunch:
One example the researchers provide is that doctors could make use of the system to make predictions about the likelihood their patients have of contracting specific diseases based on their medical history. Or, they suggest, a business owner could use their historical sales data to develop more accurate forecasts, quickly and without a ton of manual analytics work.
According to MIT:
As part of the Northstar project, we envision a completely new approach to conducting exploratory analytics. We speculate that soon many conference rooms will be equipped with an interactive whiteboard, like the Microsoft Surface Hub. Data scientists and domain experts can use the whiteboards to avoid the usual week-long, back-and-forth interactions. Instead, we believe that the two can work together during a single meeting using an interactive whiteboard to visualize, transform and analyze even the most complex data on the spot. This setting will undoubtedly help the domain experts to quickly arrive at an initial solution, which can be further refined offline. Our hypothesis is that we can make data exploration much easier for laymen while automatically protecting them from many common errors. Furthermore, we hypothesize that we can develop an interactive data exploration system that provides meaningful results in sub-seconds even for complex ML pipelines over very large datasets. The techniques will not only make machine learning more accessible to a broader range of users, but also ultimately enable more discoveries compared to any batch-driven approach.
Northstar includes four main components:
- Vizdom: a novel visual data exploration environment specifically designed for pen and touch interfaces, such as the Microsoft Surface Hub.
- IDEA: an intelligent cache and streaming approximation engine, which enables users to analyze data and create ML pipelines with immediate feedback over any type of data source and independent of the data size.
- QUDE: a component that monitors every interaction the user makes and tries to warn about common mistakes and problems.
- Alpine Meadow: a “query” optimizer for machine learning that allows users to declaratively indicate what they want (e.g., “predict label X”) while the system automatically figures out the best ML pipeline (i.e., plan) to achieve that goal.
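The "immediate feedback" idea behind an approximation engine such as IDEA can be illustrated with a progressive aggregate: report an estimate after each batch of rows, together with a rough margin of error that narrows as more data streams in. The sketch below is not IDEA's implementation. The function name `progressive_mean`, the batch size, and the normal-approximation margin are assumptions chosen for illustration, and the margin is only meaningful if rows arrive in roughly random order.

```python
import math
import random

def progressive_mean(values, batch_size=1000, z=1.96):
    """Yield (rows_seen, estimate, margin) after every batch of a data stream.

    Uses Welford's online algorithm, so each row is read once and the running
    mean and variance are updated in constant memory.
    """
    n, running_mean, m2 = 0, 0.0, 0.0
    for value in values:
        n += 1
        delta = value - running_mean
        running_mean += delta / n
        m2 += delta * (value - running_mean)
        if n % batch_size == 0:
            std = math.sqrt(m2 / (n - 1)) if n > 1 else 0.0
            yield n, running_mean, z * std / math.sqrt(n)
    if n and n % batch_size != 0:
        # Emit a final update for the last partial batch.
        std = math.sqrt(m2 / (n - 1)) if n > 1 else 0.0
        yield n, running_mean, z * std / math.sqrt(n)

# Made-up stream: daily sales figures arriving in random order.
rng = random.Random(7)
sales = [rng.uniform(100, 500) for _ in range(5000)]
for seen, estimate, margin in progressive_mean(sales):
    print(f"{seen} rows: {estimate:.1f} +/- {margin:.1f}")
```

A user interface could draw the first update right away and redraw as later updates arrive, which is the kind of interaction the component descriptions above aim for.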
Northstar embodies the Future of Work – it is everything that excites us about bringing the power of AI into the workplace, so anyone can use it. We just can’t wait for it to be at the coffee shop near you.