Aspect Ratio

One notable distinction between DALL-E and its competitor, Midjourney, is the flexibility in controlling the aspect ratio of the generated images. Midjourney lets users specify the desired aspect ratio, catering to specific size requirements for various applications; DALL-E lacks this feature. This limitation can be particularly troublesome when the task at hand demands images of a particular dimension. For instance, designers and content creators often require images that fit certain size criteria for web layouts, print media, or social media platforms. Midjourney’s ability to tailor the aspect ratio makes it a more versatile tool in such scenarios, giving users a greater level of control over the output and ensuring that the generated images align with their project needs. The absence of this feature in DALL-E, on the other hand, can necessitate additional steps, such as cropping or resizing the images externally, which could compromise the original quality or composition of the AI-generated artwork.
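To make the cost of external cropping concrete, here is a minimal sketch of how the largest centered crop for a target aspect ratio could be computed. The function name `center_crop_box` and its parameters are illustrative assumptions, not part of either tool, and a real workflow might well prefer an off-center crop that preserves the subject.

```python
def center_crop_box(width, height, ratio_w, ratio_h):
    """Return (left, top, right, bottom) of the largest centered crop
    with aspect ratio ratio_w:ratio_h inside a width x height image.

    Hypothetical helper for illustration; all values are integer pixels.
    """
    if min(width, height, ratio_w, ratio_h) <= 0:
        raise ValueError("all arguments must be positive")

    # Compare width/height against ratio_w/ratio_h by cross-multiplying,
    # which avoids floating-point rounding.
    if width * ratio_h > height * ratio_w:
        # Image is too wide: keep the full height and trim the sides.
        new_w = height * ratio_w // ratio_h
        new_h = height
    else:
        # Image is too tall (or already matches): keep the full width.
        new_w = width
        new_h = width * ratio_h // ratio_w

    left = (width - new_w) // 2
    top = (height - new_h) // 2
    return (left, top, left + new_w, top + new_h)


# A square 1024 x 1024 image cropped to 16:9 keeps only a 1024 x 576 band.
box = center_crop_box(1024, 1024, 16, 9)
print(box)
```

The resulting tuple follows the (left, upper, right, lower) convention that Pillow's `Image.crop` accepts, so it could be passed there directly. Note that in this example the crop discards roughly 44% of the square image, which is exactly the kind of compositional loss described above.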

Complexity of Text and Positioning

In the realm of AI-generated imagery, both DALL-E and Midjourney display varying degrees of proficiency in text generation, especially when comparing common phrases to more niche or specialized ones. For instance, generating well-known phrases like “Happy Birthday” tends to be more successful on both platforms, likely because such phrases are prevalent in their training datasets. However, when it comes to less common phrases, such as “2023 in AI”, the results can be less reliable. The models may struggle to understand less frequently encountered terms and to place them in a suitable context. Moreover, when it comes to the positioning of text within images, Midjourney shows a particular limitation. Unlike DALL-E, which generally manages to integrate text more seamlessly into the visual narrative, Midjourney often falters in accurately positioning text. This discrepancy can be crucial for projects where the spatial arrangement of text is as important as its content, underscoring the need for continued advancements in AI’s understanding of the intricate relationship between textual and visual elements.

In the following examples, DALL-E tends to get the spelling and positioning of the text right more often than Midjourney 6, but both are still in dire need of improvement before the images can be used “in production”. One important caveat is that inpainting with AI allows for straightforward correction of errors.

This article was originally published at www.artificial-intelligence.blog