Alisa Davidson
Published: August 19, 2025 at 9:20 am Updated: August 19, 2025 at 9:20 am
Edited and fact-checked:
August 19, 2025 at 9:20 am
In Brief
Alibaba Cloud’s Qwen team has launched Qwen-Image-Edit, a state-of-the-art image editing model that combines semantic and appearance editing with precise bilingual text modification, delivering advanced capabilities for creative and practical applications.

Alibaba Cloud’s Qwen team has launched Qwen-Image-Edit, an advanced image editing model derived from the 20B Qwen-Image framework. The new system builds on Qwen-Image’s distinctive text rendering capabilities by applying them to image editing, with a particular focus on precision in text modifications. Qwen-Image-Edit processes input images through two parallel components: Qwen2.5-VL, which manages visual semantic control, and the VAE Encoder, which governs visual appearance. This dual approach enables the model to handle both semantic-level and appearance-level editing tasks. The tool is accessible through Qwen Chat under the “Image Editing” feature.
Qwen-Image-Edit is designed to perform across several editing dimensions. It supports appearance-level adjustments, such as the addition, removal, or modification of visual elements while keeping all other regions of the image intact, and semantic-level edits, such as intellectual property (IP) creation, object rotation, or style transfer, where broader pixel changes are permitted but semantic integrity is preserved. It also provides refined text editing capabilities in both Chinese and English, allowing users to add, remove, or adjust text within images while maintaining font, size, and style consistency. Benchmark testing across several widely recognized datasets indicates that Qwen-Image-Edit reaches state-of-the-art performance in image editing, positioning it as a strong foundation model for future applications in this domain.
Qwen-Image-Edit’s Semantic And Appearance Editing For Creative And Practical Applications
One of the defining aspects of Qwen-Image-Edit is its advanced functionality in both semantic and appearance editing. Semantic editing involves altering the content of an image while ensuring that the underlying visual meaning remains intact. To illustrate this function in a straightforward way, the development team highlights its use with Qwen’s official mascot, the Capybara, as a practical example.

Observation shows that while the majority of pixels in the edited image differ from those in the original input image, the overall consistency of the Capybara character is fully maintained. This demonstrates the strong semantic editing capability of Qwen-Image-Edit, which supports flexible and varied development of original intellectual property content. In addition, within Qwen Chat, a dedicated set of editing prompts was created around the 16 MBTI personality types. Using these prompts, a complete collection of MBTI-themed emoji packs featuring the Capybara mascot was produced, extending both the representation and visibility of the character.
Moreover, novel view synthesis represents another important use case within semantic editing. Qwen-Image-Edit is capable of rotating objects by 90 degrees or executing a full 180-degree rotation, enabling direct visualization of an object’s rear side. A further example of semantic editing is style transfer, where, for instance, a standard portrait can be reinterpreted in several artistic aesthetics, including styles reminiscent of Studio Ghibli.
Alongside semantic editing, appearance editing is a frequently required function in image modification. This approach focuses on keeping specific regions of an image entirely unchanged while introducing, removing, or altering designated elements. As demonstrated in an example where a signboard is seamlessly incorporated into a scene, appearance editing lends itself to a broad range of applications such as background adjustments for people or changes of clothing. Another defining capability of Qwen-Image-Edit is its precision in text editing, a feature derived from Qwen-Image’s advanced expertise in text rendering technologies.
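The distinction between the two editing modes can be made concrete with a small check. The following is a minimal sketch, not part of Qwen-Image-Edit or its API: the function `changed_fraction`, the toy images, and the edit mask are all hypothetical and exist only for illustration. It measures how many pixels differ inside and outside a designated edit region. An appearance-level edit would be expected to leave the region outside the mask untouched, while a semantic-level edit may change most pixels everywhere.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # one RGB pixel
Image = List[List[Pixel]]      # rows of pixels
Mask = List[List[bool]]        # True where an edit is allowed


def changed_fraction(original: Image, edited: Image, mask: Mask, inside: bool) -> float:
    """Fraction of differing pixels, counted only where mask == inside."""
    total = 0
    changed = 0
    for row_o, row_e, row_m in zip(original, edited, mask):
        for p_o, p_e, m in zip(row_o, row_e, row_m):
            if m == inside:
                total += 1
                if p_o != p_e:
                    changed += 1
    return changed / total if total else 0.0


# Hypothetical 2x3 toy image; the middle column is the edit region.
GREY: Pixel = (128, 128, 128)
RED: Pixel = (255, 0, 0)
BLUE: Pixel = (0, 0, 255)

original: Image = [[GREY, GREY, GREY], [GREY, GREY, GREY]]
mask: Mask = [[False, True, False], [False, True, False]]

# Appearance-style edit: only the masked column changes.
appearance_edit: Image = [[GREY, RED, GREY], [GREY, RED, GREY]]

# Semantic-style edit: every pixel changes (for example, a restyled image).
semantic_edit: Image = [[BLUE, BLUE, BLUE], [BLUE, BLUE, BLUE]]

appearance_outside = changed_fraction(original, appearance_edit, mask, inside=False)
appearance_inside = changed_fraction(original, appearance_edit, mask, inside=True)
semantic_outside = changed_fraction(original, semantic_edit, mask, inside=False)

print(appearance_outside, appearance_inside, semantic_outside)
```

In this toy case the appearance-style edit changes nothing outside the mask, while the semantic-style edit changes every pixel. A pixel comparison like this says nothing about whether a character's identity is preserved, which is the property the article attributes to semantic editing.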
About The Author
Alisa, a dedicated journalist at the MPost, focuses on cryptocurrency, zero-knowledge proofs, investments, and the expansive realm of Web3. With a keen eye for emerging trends and technologies, she delivers comprehensive coverage to inform and engage readers in the ever-evolving landscape of digital finance.

