envIA. How to send Christmas greetings using artificial intelligence.
A couple of weeks ago we ran into the typical problem at this time of year: how to send Christmas greetings in an original way. We wanted a solution in line with who we are: an agile company where creativity and innovation go hand in hand with technology. And since we are a bit daring and enjoy a challenge, we decided to build an artificial intelligence tool to craft Christmas greetings. As a result, and in collaboration with Prodigioso Volcán (because we wanted the communication to be as good as the artificial intelligence underneath), envIA was born.
envIA: craft your own smart postcard is a creative generator that relies on the latest advances in artificial intelligence and natural language processing. Given an image and a text, it crafts a greeting postcard, adapting both so that they are better suited to the occasion.
At a high level, it is built on two modules:
Greeting poem generation
When the user enters a text, we process it with GPT-3 to generate a poem that is unique, personalized, and surprisingly coherent. Of course, if the text entered is related to the image, the whole postcard will be more coherent, since the generated poem will also be in line with the image.
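As a rough illustration of this step, the sketch below shows one way the user's text could be turned into a prompt for a text-generation model. It is a minimal sketch, not the actual envIA implementation: the function name `build_poem_prompt`, its parameters, and the prompt wording are all assumptions made for this example, and the call to the language model itself is deliberately left out.

```python
def build_poem_prompt(user_text: str, occasion: str = "Christmas", max_lines: int = 8) -> str:
    """Build a prompt asking a language model for a short greeting poem.

    Hypothetical helper: the prompt wording is an assumption for
    illustration, not the prompt used in envIA.
    """
    # Collapse repeated whitespace so stray line breaks do not leak into the prompt.
    cleaned = " ".join(user_text.split())
    if not cleaned:
        raise ValueError("user_text must not be empty")
    return (
        f"Write a short {occasion} greeting poem of at most {max_lines} lines.\n"
        f'Base it on this message from the sender: "{cleaned}"\n'
        "Poem:"
    )


if __name__ == "__main__":
    # The resulting string would be sent to the text-generation model.
    print(build_poem_prompt("Dinner  with the whole family by the tree"))
```

Keeping the prompt construction in a small, pure function like this makes it easy to test and to adjust the wording without touching the code that talks to the model.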
Image style adaptation
Adapting the style of an image posed a bigger challenge. Most likely, we have all recently seen striking images generated with Stable Diffusion or Midjourney. Even though the results are really appealing, the major drawback of these systems is that, out of the box, they typically have trouble preserving specific parts of an image. For instance, consider what happened when we used a diffusion model (a model that generates a new image from a given one, such as Stable Diffusion) to make a picture more “Christmas-y”. Given this image:
this was the result we obtained:
Sure, the resulting image is very Christmas-y… but the people in the picture are now other people, which does not make much sense for our purpose. To avoid this, we added a step before the adaptation: detecting and segmenting the people in the picture with an image segmentation model (named MaskFormer). In this way, we first build a mask that is used in combination with the original image to tell the diffusion model which parts of the image it should modify and which parts it should leave unchanged, reaching a much more acceptable result:
We also faced significant challenges when scaling the model so that many users can use it at the same time, but we’ll leave that for another post.
Once we had both building blocks required to craft a Christmas postcard (poem and image), we then only needed… everything that is not artificial intelligence, but is key to making an artificial intelligence system useful. That is to say, building a web app that lets the user interact with the model and then produces an appealing final arrangement. To this end, we were lucky to have the help of our friends at Prodigioso Volcán, who embraced the idea with great enthusiasm as soon as they heard about it.
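To make the masking idea described above more concrete, here is a toy sketch of what such a mask expresses. It works on tiny nested lists rather than real images, and it makes several assumptions that do not come from envIA itself: the label id `PERSON`, the convention that 1 means "may be repainted", and the helper names `build_edit_mask` and `composite`. In a real pipeline the mask would be passed to the diffusion model; the explicit compositing here only illustrates the effect of keeping the protected pixels.

```python
from typing import List

Grid = List[List[int]]

PERSON = 1  # hypothetical label id for "person" in the segmentation output


def build_edit_mask(labels: Grid, protected_label: int = PERSON) -> Grid:
    """Return 1 where the image may be repainted and 0 where it must be kept."""
    return [[0 if label == protected_label else 1 for label in row] for row in labels]


def composite(original: Grid, generated: Grid, edit_mask: Grid) -> Grid:
    """Take generated pixels where the mask is 1 and original pixels elsewhere."""
    return [
        [gen if m else orig for orig, gen, m in zip(orig_row, gen_row, mask_row)]
        for orig_row, gen_row, mask_row in zip(original, generated, edit_mask)
    ]


if __name__ == "__main__":
    labels = [[0, 1], [1, 0]]          # 1 marks pixels that belong to a person
    original = [[10, 20], [30, 40]]    # toy grayscale "photo"
    generated = [[99, 98], [97, 96]]   # toy "Christmas-y" output
    mask = build_edit_mask(labels)
    print(composite(original, generated, mask))
```

The point of the sketch is the division of labor: the segmentation output decides where changes are allowed, and the pixels that belong to people are carried over untouched from the original picture.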
And that’s how envIA: craft your own smart postcard was born in less than two weeks.
Other use cases
Artificial intelligence is a cross-cutting technology: it can be used to generate Christmas postcards, and for many other purposes. The technology we used to adapt the style of a picture could also be applied in other scenarios, such as:
- In the retail domain, can you imagine the number of pictures that need to be taken to have the image of a garment in its different colors? Well, this kind of technology can help alleviate that cost.
- When putting garments up for sale in an e-commerce store, post-editing the pictures is a very manual and costly process. The goal of this process is to give the image a more appealing look without losing its natural feel. For instance, this process was a bottleneck at Micolet, and it was hindering their growth. We deployed technology to remove this bottleneck, allowing them to continue growing. Given how fast technology has evolved in recent months, the potential for automation has increased significantly. In fact, we’re already preparing another surprise around this topic. Stay tuned!
- Let’s now think of a different vertical: real estate. Even though property is typically sold unfurnished, seeing the rooms furnished is key to understanding the size and potential of each room. This kind of technology could be used to furnish the rooms virtually.
And these are only some of the potential use cases that aim to automate a specific manual process… but the possibilities from a more creative standpoint are much wider: for instance, serving as inspiration for designers. We elaborate on this use case here.
We can also mention other use cases of the natural language processing technology we have been using, such as:
- When generating content (such as this article), it is often the case that the text is developed from a small set of ideas. The type of NLP model we have been using to generate poems could also be used to, say, generate a full draft from a list of key points, which is then handed to a professional copywriter for editing.
- Personalization is typically seen as the Holy Grail of marketing. Imagine a system that is able to personalize and adapt the messages sent via social media, or via email, to the style of the person that will be receiving the message. This type of adaptation, which would be way too costly to do manually, could be automated using NLP technology similar to the one we have been using to generate poems.
What are your thoughts?
Get in touch with us at email@example.com if you want to know more about the potential use cases of this technology!