Huawei has unveiled UCM (Unified Cache Manager), a toolkit that uses key-value (KV) caching to speed up large-model inference, extend context windows, and lower the cost of processing each token. UCM is already in trial use at UnionPay for customer service and marketing, and Huawei plans to open-source it in September.