AI Agents for Customs Documentation at Port Everglades: What Works in Production
A workflow-level look at what AI agents actually do inside a customs filing operation — from commercial invoice to CBP ACE — and where they break.

It is 7:40 on a Monday morning in Fort Lauderdale. The customs manager at a mid-sized freight forwarder opens her inbox. There are forty-three entries pending for the week. Three consignees have sent rush requests — two for refrigerated shipments out of Santo Domingo that need to clear before their cold chain resets, one for an electronics shipment out of Kingston that the importer wants on the truck by Wednesday. There are two filings held on documentation discrepancies: one where the commercial invoice from a supplier in Tegucigalpa doesn't match the bill of lading, one where the certificate of origin for a CAFTA-DR claim is missing a signature. There is one entry that kicked back from CBP over the weekend because of a misclassified HTS code on a shipment of auto parts from Honduras.
She has seven hours to clear the board before the Tuesday vessel arrivals start generating a new stack.
This is the work. It is rule-bound, multi-step, document-heavy, and unforgiving of errors. It is also the category of work where AI agents have moved from pilot demos to real production deployments in the last eighteen months. This post is about what that actually looks like at the filing level — what an agent does, where it struggles, and how a customs manager should evaluate whether an implementation is genuinely working.
I am not going to tell you AI is transforming logistics. You can read that on any vendor site. I am going to walk you through the work.
What the agent actually ingests
The agent's first job is to consume everything the filing will touch. For a typical Port Everglades inbound entry, that means the commercial invoice, packing list, bill of lading, certificate of origin if applicable, any supplier correspondence that modifies the original order, the HTS schedule for the current date, the Partner Government Agency requirements for the specific commodity, the carrier's ETA data from the terminal operator, and the client's standing preferences — preferred entry type, surety bond details, any blanket declarations on file, FDA or USDA flags if the goods are in a regulated category.
None of this is "reading a PDF." The agent has to extract structured data from sources that range from clean XML messages out of a major exporter's ERP to a scanned commercial invoice that the supplier in Port-au-Prince printed, signed in blue ink, and sent back as a phone photo. Where the documents are clean, the agent produces a filing-ready data set in seconds. Where they are not — and at Port Everglades they are often not — the agent has to make calls about what to do when fields conflict.
Take a real example. A Caribbean exporter ships a container of mixed electronics to a South Florida consolidator. The commercial invoice lists "electronic components" as the description for three line items. There is no HTS suggestion on the invoice. The packing list has more detail — model numbers, specs — but the model numbers don't match the exporter's own catalog because the exporter recently rebranded. The supplier correspondence references the older model numbers. The agent has to reconcile all of this, cross-reference the probable product descriptions against the HTS schedule, look at prior entries from the same consignee to see what classifications the broker accepted in the past, and propose a classification with a confidence score for each line item.
This is not data entry. This is domain-specific reasoning over conflicting inputs. An agent that treats it as data entry — the kind of agent a non-customs vendor ships — produces filings that fail validation at CBP or, worse, pass validation but are wrong.
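To make the reconciliation step concrete, here is a minimal sketch of evidence-weighted classification scoring, assuming the agent pools candidate codes from each document and prior filing history. The source names, weights, and HTS candidates are invented for illustration; no vendor's actual model is implied.

```python
from dataclasses import dataclass

# Hypothetical sketch: score candidate HTS codes for one invoice line
# by combining evidence from the documents the agent ingested.
# Weights and field names are illustrative, not a real system's.

@dataclass
class Evidence:
    source: str       # "invoice", "packing_list", "prior_entry", ...
    hts_code: str     # candidate HTS classification
    weight: float     # how much this source is trusted

def propose_classification(evidence: list[Evidence]) -> tuple[str, float]:
    """Return the best-supported HTS candidate and a confidence score.

    Confidence is the winner's share of total evidence weight, so
    conflicting sources pull the score down and push the line item
    toward human review.
    """
    totals: dict[str, float] = {}
    for ev in evidence:
        totals[ev.hts_code] = totals.get(ev.hts_code, 0.0) + ev.weight
    best = max(totals, key=totals.get)
    confidence = totals[best] / sum(totals.values())
    return best, confidence

# A prior accepted entry agrees with the packing list; the vague
# invoice description weakly supports a different heading.
evidence = [
    Evidence("packing_list", "8517.62.0090", 0.35),
    Evidence("prior_entry",  "8517.62.0090", 0.45),
    Evidence("invoice",      "8543.70.9960", 0.20),
]
code, conf = propose_classification(evidence)  # confidence ≈ 0.8
```

The point of the confidence score is not the number itself but the routing decision it drives: low agreement across sources means the line goes to a broker, not to the filing queue.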
Where judgment gets codified
The hardest part of customs work has never been the data. It has been the judgment. Whether goods qualify for a free trade agreement and whether filing for it is worth the administrative cost. Whether a classification will trigger Section 301 on Chinese transshipment. Whether an entry should be filed as type 01 or type 06 given the importer's bonded warehouse position. Whether a consolidation improves margin or just shifts complexity. Whether to file a post-entry correction on an error or let it ride.
An AI agent does not replace this judgment. It externalizes it.
Every experienced customs broker carries rules in their head. "For this importer we always file type 01 because their operation is straightforward. For that importer we file type 06 because their bonded strategy changes the math. For this supplier we always request the CAFTA certificate upfront because they are slow to produce it after the entry is filed. For that consignee we never classify anything under 8517.62 without a second review because they had a Section 301 issue in Q3 of last year."
When those rules live only in the broker's head, the firm has three problems. The rules are not consistently applied. The rules cannot scale beyond the broker's own throughput. The rules disappear when the broker leaves.
An agent implementation turns those rules into written policy the agent applies. The implementation process is, in large part, a knowledge extraction exercise — sitting with your senior brokers and getting the tribal knowledge documented. The agent then applies it consistently across every filing for that importer or that supplier or that commodity.
The firm's accumulated expertise stops being tribal and starts being codified. That is a bigger change to the business than the time savings on data entry.
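What "written policy the agent applies" looks like in practice is rules as data, keyed to an importer, supplier, or commodity. This is an illustrative sketch echoing the rules quoted above; every identifier and rule entry here is invented, not a real firm's policy file.

```python
# Hypothetical codified-rules store. The entries mirror the tribal
# knowledge examples in the text; keys and values are made up.
RULES = [
    {"scope": "importer", "key": "ACME-MIA",
     "action": "entry_type", "value": "01",
     "reason": "straightforward operation"},
    {"scope": "importer", "key": "BONDED-FTL",
     "action": "entry_type", "value": "06",
     "reason": "bonded warehouse strategy"},
    {"scope": "supplier", "key": "SPS-HN-014",
     "action": "request_doc_upfront", "value": "CAFTA-DR certificate",
     "reason": "slow to produce after filing"},
    {"scope": "hts_prefix", "key": "8517.62",
     "action": "require_second_review", "value": True,
     "reason": "Section 301 issue in Q3 last year"},
]

def applicable_rules(entry: dict) -> list[dict]:
    """Return every codified rule that applies to this entry."""
    hits = []
    for rule in RULES:
        if rule["scope"] == "importer" and entry["importer"] == rule["key"]:
            hits.append(rule)
        elif rule["scope"] == "supplier" and entry["supplier"] == rule["key"]:
            hits.append(rule)
        elif rule["scope"] == "hts_prefix" and any(
                line.startswith(rule["key"]) for line in entry["hts_lines"]):
            hits.append(rule)
    return hits

entry = {"importer": "ACME-MIA", "supplier": "SPS-HN-014",
         "hts_lines": ["8517.62.0090", "8471.30.0100"]}
# This entry trips three rules: entry type, upfront certificate
# request, and the second-review flag on 8517.62.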
One concrete example of what this looks like in operation. An entry comes in for a shipment out of San Pedro Sula headed to a consignee in Miami-Dade. The agent detects that the goods qualify for CAFTA-DR treatment based on the country-of-origin documentation from the exporter. It calculates the duty savings against filing without the preference — in this case roughly 4.8% of the invoice value. It flags that the CAFTA certificate has to be retained on file for five years per CBP regulation. It checks the importer's document repository for the certificate, finds it, attaches it to the entry package, and queues the filing for broker review. The broker sees a complete CAFTA-DR claim ready for submission and either approves it or, if something looks off, overrides it.
The broker's judgment stays with the broker. The preparation work — which is 80% of the clock time — is handled.
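The CAFTA-DR example above reduces to a small gate-and-quantify step. This sketch uses invented values and field names; the duty rate on a real line comes from the HTS for the specific classification and entry date.

```python
# Illustrative sketch of the CAFTA-DR preparation step described
# above: quantify the preference, gate the claim on the certificate,
# and carry the retention requirement forward. All values invented.

def prepare_cafta_claim(invoice_value: float, mfn_rate: float,
                        cert_on_file: bool) -> dict:
    """Gate a CAFTA-DR claim on the certificate and quantify savings."""
    if not cert_on_file:
        # No certificate, no claim: escalate so a human chases
        # the exporter before the entry is filed.
        return {"status": "escalate", "reason": "CAFTA certificate missing"}
    return {
        "status": "broker_review",
        "duty_saved": round(invoice_value * mfn_rate, 2),
        # Echoes the recordkeeping requirement flagged in the text.
        "note": "retain certificate on file for five years",
    }

# The example above: roughly 4.8% of invoice value. On a
# hypothetical $120,000 entry, the preference is worth $5,760.
claim = prepare_cafta_claim(120_000, 0.048, cert_on_file=True)
```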
The CBP ACE interface problem
One thing the AI vendors in this space often blur, and which every customs manager should be clear on: AI agents do not file directly with CBP.
CBP's Automated Commercial Environment accepts filings through three channels. The ACE Portal, which is the manual web interface. The ACE Secure Data Portal, for structured submissions of specific document types. And ABI — the Automated Broker Interface — which is the EDI message set that licensed brokers use for essentially all production filings.
ABI is a legacy system. It operates on filer codes issued to licensed customs brokers. The software ecosystem that talks to ABI is mature — CargoWise, Descartes, Thomson Reuters ONESOURCE, and a handful of proprietary filers most mid-sized brokerages have built or customized over the last twenty years. These are the systems that actually submit to CBP.
Where does the agent fit? Upstream of the filer. The agent's job is to produce filing data — complete, validated, formatted — and hand it off to the broker's existing ABI software. The broker reviews in their usual interface. The broker submits. The agent does not touch ABI directly.
This distinction matters for three reasons.
First, it keeps the licensed broker in the regulatory chain. A CBP filing requires a licensed broker of record. The agent is a preparation tool. The broker is the filer. This is not a technicality — it is a legal structure that protects both the firm and the client.
Second, it keeps the investment in your existing filer system. If you have built custom logic in CargoWise over the last decade, you do not throw it away. The agent produces filings that drop into your existing workflow. Your ABI system keeps doing what it does.
Third, it clarifies responsibility when a filing fails. If the agent produced bad data, the failure is upstream of the broker. If the broker's filer system failed, the failure is downstream of the agent. The interface between them is where accountability lives.
The same pattern applies to PGA message sets — the electronic filings for FDA prior notice on food imports, USDA APHIS on agricultural products, EPA on regulated chemicals. The agent prepares the PGA message. The filer submits it. The agent's responsibility ends at the point the broker reviews and approves.
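The boundary described in this section can be sketched as a single handoff function: the agent's output either escalates upstream or lands as a broker-ready package, and nothing the agent does touches ABI. Field names here are hypothetical, not any filer system's actual schema.

```python
# Sketch of the agent-to-filer boundary. The agent validates and
# packages; the licensed broker reviews; the existing ABI software
# submits. Required fields are an invented, simplified set.

REQUIRED_FIELDS = {"entry_type", "importer_of_record", "hts_lines",
                   "invoice_value", "country_of_origin", "surety_bond"}

def package_for_broker(filing: dict) -> dict:
    """Return a broker-ready package, or escalate upstream."""
    missing = REQUIRED_FIELDS - filing.keys()
    if missing:
        # Upstream failure: the agent escalates. Nothing incomplete
        # ever reaches the broker's ABI queue.
        return {"status": "escalated", "missing": sorted(missing)}
    # Everything downstream of this return is the broker's domain:
    # review in the usual filer interface, then ABI submission.
    return {"status": "ready_for_broker_review", "payload": filing}
```

The accountability argument in the text maps directly onto this function's two return values: anything `escalated` is the agent's problem, anything that fails after `ready_for_broker_review` is downstream of it.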
What actually goes wrong
Vendors do not write about failure modes. Customs managers do. Here are the ones that show up in real production deployments, in rough order of frequency.
Incomplete documentation. The agent can work with missing fields, but it has to escalate when the missing field is material to classification. A commercial invoice that lists "parts" without a product description leaves the agent with nothing meaningful to classify against. The correct behavior is to flag the entry as requiring human input and wait. Silent guessing is a failure mode. Escalation is not.
Classification edge cases. Products that legitimately sit between two HTS codes. This is not a failure of the agent. It is a failure of the product to be cleanly classifiable. A good agent surfaces the ambiguity with the supporting case for each option and asks the broker to pick. A bad agent picks autonomously based on whichever option the training data saw more often.
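The first two failure modes share one correct behavior: escalate rather than guess. A minimal sketch of that triage, assuming the agent already holds scored HTS candidates per line item; the field names and the 0.15 ambiguity threshold are invented for illustration.

```python
# Hypothetical triage for a single line item: escalate on material
# missing fields, escalate on close-scoring candidates, otherwise
# proceed autonomously. Threshold and fields are illustrative.

MATERIAL_FIELDS = {"product_description", "country_of_origin", "value"}

def triage_line(line: dict, candidates: dict[str, float]) -> str:
    """Decide whether a line item is safe to classify autonomously."""
    if any(line.get(f) in (None, "") for f in MATERIAL_FIELDS):
        # "Parts" with no description: nothing to classify against.
        return "escalate: material field missing"
    ranked = sorted(candidates.values(), reverse=True)
    if len(ranked) > 1 and ranked[0] - ranked[1] < 0.15:
        # Two codes with close scores is legitimate ambiguity.
        # Surface both cases to the broker; do not pick the one
        # the training data saw more often.
        return "escalate: classification ambiguous"
    return "autonomous"
```

Note that the autonomous path is the narrow one: the function defaults to asking a human whenever either input is weak, which is what separates escalation from silent guessing.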
Supplier inconsistency. When the same exporter files commercial invoices with wildly different formatting month to month — or when a new exporter comes online without a history — the agent has to learn the supplier's patterns over several entries before its accuracy improves. This is where "supplier profile maturation" matters. Early entries from a new supplier should have a higher human-review rate than entries from a supplier with two years of history.
Regulatory changes mid-filing. Section 301 tariff lists change. FDA food category rules change. CAFTA-DR rules of origin are periodically revised. An agent working off stale reference data produces filings that are factually wrong. The reference data pipeline — how current the agent's view of the HTS, the tariff lists, and the PGA requirements is — is as important as the agent itself. Ask any vendor how often their reference data refreshes and how quickly their agents pick up changes. If they cannot answer, they are not running in production.
Client-specific preferences that are not documented. The tribal knowledge problem in reverse. If the senior broker never wrote down "we always file this client as entry type 09," the agent will not know, and the junior broker reviewing the agent's output may not catch it either. Implementation is a knowledge-extraction exercise. Skipping that step means the agent is trained on the general pattern and misses the client-specific overrides.
CBP's own inconsistency. Different ports have different inspection patterns. Different officers interpret regulations differently. Some entries get held for CBP's own reasons that are not predictable from the filing content. The agent reports these as ambiguous downstream states, not as its own failures. This is a different category of problem than agent performance and should not be measured that way.
How to evaluate whether an implementation is actually working
If you deploy an agent and do not measure it, you will not know whether it is helping you or quietly producing filings that will fail CBP audit in eighteen months. Here are the metrics that matter.
First-pass validation rate at ACE. What percentage of the agent's prepared filings validate on the first broker submission to ABI, without corrections? A human-prepared filing baseline at a well-run brokerage is typically 92 to 96 percent. A well-implemented agent should be at 98 percent or higher. If the agent's filings are validating at or below the human baseline, the deployment is underwater.
Classification accuracy on manual review. What percentage of the agent's proposed HTS codes survive the broker's review without modification? Target is 95 percent or higher on matured supplier profiles, somewhat lower on new suppliers in the profile-maturation phase. If the broker is modifying more than one classification in ten on established suppliers, the agent's reference data or training is off.
Escalation rate. What percentage of entries get flagged for human review instead of handled autonomously? Early deployments run 40 to 60 percent escalation as the agent learns the firm's conventions. Mature deployments run 15 to 25 percent. Below 10 percent is suspicious — the agent is probably guessing in cases where it should be asking. Above 30 percent in a mature deployment means the agent is not learning from the review feedback.
Time-to-entry. Median time from receipt of the commercial invoice to a broker-ready filing package. In a working deployment this number drops 60 to 80 percent. If it does not, the agent is either too slow at preparation or too conservative with escalations.
Filing error rate post-submission. Percentage of filings that kick back from CBP after submission — for classification errors, documentation errors, or PGA rejections. This should be materially below the pre-agent baseline. If it is not, the agent is shifting work around rather than improving it.
If your customs operation is not tracking these numbers, you do not have a measurement of the AI agent's performance. You have a vibe.
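The scoreboard above is cheap to compute from a filing log. This sketch assumes an invented per-filing record shape; the target comments restate the thresholds from the text.

```python
# Minimal sketch of the five metrics, computed from a filing log.
# Record field names are hypothetical; targets come from the text.

def scorecard(filings: list[dict]) -> dict:
    """Compute the deployment metrics over a list of filing records."""
    n = len(filings)
    return {
        # Target: >= 0.98 (human baseline 0.92-0.96).
        "first_pass_validation":
            sum(f["validated_first_try"] for f in filings) / n,
        # Target: >= 0.95 on matured supplier profiles.
        "classification_accuracy":
            sum(f["hts_unchanged_in_review"] for f in filings) / n,
        # Mature deployments: 0.15-0.25. Below 0.10 is suspicious.
        "escalation_rate":
            sum(f["escalated"] for f in filings) / n,
        # Must be materially below the pre-agent baseline.
        "post_submission_error":
            sum(f["cbp_rejection"] for f in filings) / n,
    }
```

Time-to-entry is the one metric missing here because it needs timestamps rather than flags; the same log, with receipt and broker-ready times per filing, yields the median directly.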
Who should still be worried
A few categories of customs operation where I would be more cautious about deploying an agent on any near-term timeline.
Operations that file heavily into high-ambiguity commodity categories — apparel with complex quota interactions, dual-use goods with BIS export implications, controlled substances under DEA oversight. The agent can do the preparation work but the judgment surface is so large that the review overhead approaches the original clock time. The ROI is smaller.
Operations that file for importers who change their sourcing patterns constantly. The supplier profile maturation does not get the repetition it needs. Every entry is effectively a new supplier.
Operations that are already running lean and do not have senior broker time to invest in the knowledge-extraction phase of implementation. An agent rolled out without the broker's accumulated rules extracted and codified is operating without the firm's institutional knowledge — generating output that looks complete but misses the specifics that matter. It is worse than manual filing, because it introduces false confidence.
Operations where the licensed broker of record is not bought in. The broker is the filer. If the broker is suspicious of the agent's output and reviews every line of every filing, you have not saved any time and you have added a cost.
Knowing whether your operation fits any of these categories is part of the scoping work we do before we recommend deployment.
What changes for the customs manager
The customs manager I described at the top — the one with forty-three entries on Monday morning — has the same fundamental job after an agent is in production. The profession is customs. It rewards precision and punishes carelessness. None of that changes.
What changes is who carries the cognitive load for the rule-bound portion. The customs manager who used to spend 70 percent of her time on documentation preparation and 30 percent on the hard judgment calls now inverts that ratio. The hard calls are still hers. The firm runs leaner. The margin improves. Her Monday morning inbox has forty-three entries in it, and forty of them are already broker-ready when she arrives.
That is what production deployment of AI agents in customs documentation actually looks like. Not transformation. Not disruption. Reallocation of cognitive load from rule-bound work to judgment work, in the shop where that reallocation was always the bottleneck.
If you run a customs operation at Port Everglades or anywhere in the tri-county and you are evaluating what this would look like in your specific filing flow, the service page for marine and logistics has the engagement structure. The underlying framework is in our foundational piece on agentic AI for South Florida logistics. The n8n security analysis covers the security considerations that come with any agent given access to filing credentials.
