# The Lakehouse

> A shared knowledge base of product, technology, and document data that enrichment draws on.

The Lakehouse is a shared knowledge base that
[enrichment](/enrichment/how-it-works) reads from when filling in your products.
It holds structured product data, brand technologies, and the source documents
those were extracted from. Enrichment consults the Lakehouse before scraping the
open web, so data that your organization has already contributed, or that is
shared across organizations, can complete a product without another round trip
to a brand site.
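
The lookup order described above can be sketched as a simple fallback. This is an illustration only, not the enrichment implementation: `resolve_attributes`, the `Lookup` type, and the sample data are hypothetical names introduced for this example.

```python
from typing import Callable, Optional

# Hypothetical lookup signature: takes a product key, returns attributes or None.
Lookup = Callable[[str], Optional[dict]]


def resolve_attributes(
    product_key: str, lakehouse: Lookup, web_scrape: Lookup
) -> tuple[str, Optional[dict]]:
    """Try the Lakehouse first; fall back to the open web only on a miss."""
    hit = lakehouse(product_key)
    if hit is not None:
        return "lakehouse", hit
    return "web", web_scrape(product_key)


# Illustrative in-memory stand-ins for the two sources.
lakehouse_data = {"acme-runner-2": {"cushioning": "AirFoam"}}
scrape_calls: list[str] = []


def fake_scrape(key: str) -> Optional[dict]:
    scrape_calls.append(key)  # record that a round trip to the web happened
    return {"cushioning": "unknown"}


source, attrs = resolve_attributes("acme-runner-2", lakehouse_data.get, fake_scrape)
```

When the Lakehouse already holds the product, `fake_scrape` is never called, which is the saved round trip the paragraph above refers to.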

The Lakehouse is a *source*, not a destination for your catalog: you don't add
sellable [products](/catalog/products-variants) to it directly. You contribute
to it by uploading documents, and it feeds enrichment.

## What the Lakehouse holds

<ResponseField name="Lakehouse products" type="extracted product data">
  Structured product records pulled from uploaded documents — names, brands, and
  attributes. Enrichment matches these against your catalog products to fill in
  missing values.
</ResponseField>

<ResponseField name="Technologies" type="brand technologies">
  Named brand technologies and features (materials, cushioning systems, and the
  like), with the source document they came from. These enrich the products that
  reference them.
</ResponseField>

<ResponseField name="Documents" type="the source files">
  The uploaded files — spec sheets, catalogs, and similar — that products and
  technologies are extracted from. Each document keeps a link back to what was
  derived from it.
</ResponseField>
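
One way to picture the three record types and the links between them is the sketch below. The class and field names are assumptions made for illustration and do not reflect the actual MerchantOps schema. `fill_missing` shows the "fill in missing values" behavior under the further assumption that existing catalog values are never overwritten.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    filename: str
    # IDs of the records derived from this file (the link back from the document).
    derived_product_ids: list[str] = field(default_factory=list)
    derived_technology_ids: list[str] = field(default_factory=list)


@dataclass
class LakehouseProduct:
    product_id: str
    name: str
    brand: str
    attributes: dict[str, str]
    source_doc_id: str  # the document this record was extracted from


@dataclass
class Technology:
    technology_id: str
    name: str
    brand: str
    source_doc_id: str  # the document this technology came from


def fill_missing(catalog_attrs: dict[str, str], lakehouse: LakehouseProduct) -> dict[str, str]:
    """Copy Lakehouse values only into attributes the catalog product lacks."""
    filled = dict(catalog_attrs)
    for key, value in lakehouse.attributes.items():
        if not filled.get(key):  # absent or empty counts as missing
            filled[key] = value
    return filled
```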

## Documents and extraction

You populate the Lakehouse by uploading documents. Extraction then reads each
file and proposes the products and technologies it found.

<Steps>
  <Step title="Upload the file">
    Upload one or more documents and tag them with the brand and category they
    describe. Each upload is tracked as a [Job](/jobs/overview).
  </Step>

  <Step title="Extraction runs in the background">
    Extraction runs as a tracked Job. MerchantOps reads each document and
    extracts candidate products and technologies, each with a confidence score.
  </Step>

  <Step title="You review the column mapping">
    For spreadsheet documents, MerchantOps proposes how each column maps to a
    known field and shows you a preview. You confirm, adjust, or reject the
    mapping before the extracted data is accepted. This human-in-the-loop review
    is distinct from the automatic column mapping used when
    [uploading products](/data-ingestion/uploading-products).
  </Step>

  <Step title="Extracted data enters the Lakehouse">
    Once approved, the extracted products and technologies become available to
    enrichment. Low-confidence extractions can be flagged for review.
  </Step>
</Steps>
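
The review and acceptance steps above can be sketched as a small gate. Everything here is hypothetical: the names, the 0.0 to 1.0 confidence scale, and the `REVIEW_THRESHOLD` value are assumptions, and the sketch also assumes that flagged candidates are still accepted rather than held back.

```python
from dataclasses import dataclass
from enum import Enum


class MappingDecision(Enum):
    CONFIRMED = "confirmed"
    ADJUSTED = "adjusted"
    REJECTED = "rejected"


@dataclass
class Candidate:
    name: str
    confidence: float  # assumed to range from 0.0 to 1.0


REVIEW_THRESHOLD = 0.7  # hypothetical cutoff, not a documented value


def accept_extraction(
    candidates: list[Candidate], decision: MappingDecision
) -> tuple[list[Candidate], list[Candidate]]:
    """Return (accepted, flagged_for_review) for one document.

    Nothing is accepted when the reviewer rejects the mapping.
    """
    if decision is MappingDecision.REJECTED:
        return [], []
    flagged = [c for c in candidates if c.confidence < REVIEW_THRESHOLD]
    return list(candidates), flagged
```

For example, confirming a mapping with candidates scored 0.95 and 0.4 would accept both and flag only the second for review.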

<Note>
  Documents themselves always belong to the organization that uploaded them.
  Whether the *data extracted* from them is shared more broadly is controlled by
  your sharing settings, below.
</Note>

## Per-org sharing

The Lakehouse can be shared across organizations, and what you share is
configurable. By default, product and technology data is shared, while MAP
pricing data is kept private. Changing a sharing setting affects **future**
writes only — data already contributed keeps the sharing decision it was written
with.

Manage this at [Lakehouse sharing settings](/settings/lakehouse-sharing).
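
The "future writes only" rule is easiest to see if you think of each record as carrying the sharing decision that was in effect when it was written. The sketch below models that idea with hypothetical names (`OrgLakehouseWriter`, `Record`, and the setting keys are all assumptions). Only the defaults come from the description above.

```python
from dataclasses import dataclass

# Defaults as described above: product and technology data shared, MAP pricing private.
DEFAULT_SHARING = {"product": True, "technology": True, "map_pricing": False}


@dataclass(frozen=True)
class Record:
    kind: str
    payload: str
    shared: bool  # fixed at write time; later setting changes do not touch it


class OrgLakehouseWriter:
    def __init__(self) -> None:
        self.sharing = dict(DEFAULT_SHARING)
        self.records: list[Record] = []

    def write(self, kind: str, payload: str) -> Record:
        # Stamp the record with the setting that is current right now.
        record = Record(kind, payload, shared=self.sharing[kind])
        self.records.append(record)
        return record


writer = OrgLakehouseWriter()
first = writer.write("product", "Acme Runner 2")
writer.sharing["product"] = False  # change the setting after the first write
second = writer.write("product", "Acme Runner 3")
```

Here `first` stays shared even though the setting was switched off afterwards, while `second` is written as private.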

<CardGroup cols={2}>
  <Card title="How enrichment works" icon="wand-magic-sparkles" href="/enrichment/how-it-works">
    How enrichment consults the Lakehouse before the open web.
  </Card>

  <Card title="MAP policies" icon="scale-balanced" href="/pricing/map-policies">
    The pricing data the Lakehouse can hold, kept private by default.
  </Card>

  <Card title="Sharing settings" icon="sliders" href="/settings/lakehouse-sharing">
    Choose what your organization shares into the Lakehouse.
  </Card>

  <Card title="Uploading products" icon="file-csv" href="/data-ingestion/uploading-products">
    Adding sellable products to your catalog, as opposed to the Lakehouse.
  </Card>
</CardGroup>
