Build AI data lakes to optimize data-driven collaboration, safeguard planning and forecasting, and drive supply chain cost efficiencies that protect profits and grow your income
Imagine getting thousands of tables and 300+ different spreadsheets. Each tab in each workbook holds different data, and some of the columns have no headers at all. You can dump all of this data into a large language model (LLM) and it should figure out what each column is. But first you have to train your LLM.
This process is feature extraction and it’s used when deploying AI in manufacturing supply chains.
When manufacturers with extended contract manufacturing supply chains, like Dell, General Motors or Samsung, contract services like Snowflake, Databricks, Azure Synapse or BigQuery, it’s a multi-million dollar project.
This is because ETL data cleansing can last forever. ETL is the process of extracting, transforming and loading data combined from multiple sources into a large, central repository, called a data warehouse or data lake, so that manufacturers use only relevant, quality data.
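To make the three ETL steps concrete, here is a minimal sketch using only Python's standard library. The two supplier exports, the column names (part_no, unit_cost) and the cleansing rules are assumptions made up for illustration; a real project would load into a platform such as the ones named above rather than an in-memory SQLite table.

```python
import csv
import io
import sqlite3

# Hypothetical raw exports from two suppliers; names and values are invented.
SUPPLIER_A = "part_no,unit_cost\n ab-100 ,4.50\nab-200,\n"
SUPPLIER_B = "part_no,unit_cost\nAB-300,12.00\nab-100,4.50\n"

def extract(text):
    """Extract: parse one CSV source into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: normalize part numbers, drop rows with no cost, de-duplicate."""
    seen, clean = set(), []
    for row in rows:
        part = row["part_no"].strip().upper()
        cost = row["unit_cost"].strip()
        if not cost or part in seen:
            continue  # skip incomplete rows and repeated part numbers
        seen.add(part)
        clean.append((part, float(cost)))
    return clean

def load(records, conn):
    """Load: write the cleaned records into one central table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS parts (part_no TEXT PRIMARY KEY, unit_cost REAL)"
    )
    conn.executemany("INSERT INTO parts VALUES (?, ?)", records)
    conn.commit()

conn = sqlite3.connect(":memory:")  # stand-in for the warehouse or data lake
load(transform(extract(SUPPLIER_A) + extract(SUPPLIER_B)), conn)
loaded = conn.execute("SELECT part_no, unit_cost FROM parts ORDER BY part_no").fetchall()
print(loaded)
```

Most of the effort sits in the transform step: deciding what counts as a duplicate or an unusable row is exactly the kind of judgment that takes time on a real project.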
Bayesian networks and data processing
ETL data processing is simple, but good ETL requires domain expertise (e.g., subject matter experts) for accurate classification. Selecting the right people is important, and if the right process is in place, ETL projects don’t have to last forever.
In most large manufacturing enterprises, ETL can be performed in four or five meetings of up to 30 people. Everyone identified to attend these ETL meetings should be required to attend every one of them.
Initially, each person is assigned a workbook and labels the columns on its tabs. If someone cannot determine how a column should be labeled, there is always someone else in the meeting who can. Managing an ETL project this way can save months of incorrect labeling.
But most large manufacturers have commodity sourcing and supply chain category managers located all over the world, and bringing everyone together physically at one location is not practical. So solving ETL challenges within a project’s timeframe often requires building a robust model that can figure out what all of the data is and how it should be labeled. Good artificial intelligence practitioners, working with the right subject matter experts, can build good models with very limited information.
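As one illustration of a model that labels headerless columns from very little information, the sketch below trains a naive Bayes classifier (the simplest form of Bayesian network) on a handful of columns that experts have already labeled. This is an assumed stand-in, not the specific model the article has in mind: the cell_features rules, the ColumnLabeler class and the sample data are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

def cell_features(value):
    """Map one raw cell to a coarse shape token (an assumed, illustrative feature set)."""
    v = value.strip()
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", v):
        return ["date_like"]
    if re.fullmatch(r"\d+", v):
        return ["integer"]
    if re.fullmatch(r"\d+\.\d+", v):
        return ["decimal"]
    if re.fullmatch(r"[A-Za-z]+-\d+", v):
        return ["code_like"]
    return ["free_text"]

class ColumnLabeler:
    """Multinomial naive Bayes over cell-shape tokens, with add-one smoothing."""

    def __init__(self):
        self.token_counts = defaultdict(Counter)  # label -> token -> count
        self.label_counts = Counter()             # label -> number of training columns
        self.vocab = set()

    def fit(self, labeled_columns):
        for label, cells in labeled_columns:
            self.label_counts[label] += 1
            for cell in cells:
                for tok in cell_features(cell):
                    self.token_counts[label][tok] += 1
                    self.vocab.add(tok)

    def predict(self, cells):
        total_cols = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_cols in self.label_counts.items():
            score = math.log(n_cols / total_cols)  # log prior
            denom = sum(self.token_counts[label].values()) + len(self.vocab)
            for cell in cells:
                for tok in cell_features(cell):
                    # smoothed log likelihood of this token under the label
                    score += math.log((self.token_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# A few columns labeled by subject matter experts (invented examples).
training = [
    ("part_number", ["AB-100", "CX-2210", "QZ-7"]),
    ("unit_cost", ["4.50", "12.00", "0.75"]),
    ("order_date", ["2023-01-15", "2023-02-01", "2023-03-09"]),
    ("quantity", ["100", "2500", "40"]),
]
labeler = ColumnLabeler()
labeler.fit(training)
print(labeler.predict(["ZK-881", "MM-12"]))
print(labeler.predict(["19.99", "3.10"]))
```

The shape tokens are where domain expertise enters: subject matter experts can say which patterns distinguish, for example, a part number from a purchase order number, and those rules become features even when only a few labeled columns exist.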