Text-to-Image Generator: StyleScribe uses advanced GAN models to convert textual descriptions into realistic images of fashion products.
Large Dataset: The project utilizes a meticulously curated dataset of 60,000 unique fashion product images and their corresponding descriptions to train the model.
NLP Integration: Extensive preprocessing of textual data ensures semantic coherence and relevance, enhancing the quality of generated images.
Custom GAN Architecture: A tailored GAN model, comprising a generator and discriminator, is designed to produce high-fidelity fashion images.
Scalable Infrastructure: The project is hosted on Google Cloud RDP and uses Flask for the web interface and Firebase for real-time database management.
Frontend:
● Developed using Flask, the frontend provides an intuitive web interface for users to input text descriptions and view the generated images.
Backend:
● The backend is powered by a Flask server, which handles incoming requests and communicates with the model server (a minimal sketch follows this component list).
Model Server:
● Hosted on Google Cloud RDP, the model server processes text inputs through the GAN model and generates images.
Database:
● Firebase Firestore/Realtime Database stores metadata and text descriptions, ensuring real-time data synchronization.
Storage:
● Google Cloud Storage or Firebase Storage is used to store the training dataset and generated images.
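Below is a minimal sketch of how the Flask backend might receive a prompt and forward it to the model server. The endpoint names, the model-server URL placeholder, and the response shape are illustrative assumptions, not the project's confirmed API.

```python
# app.py -- minimal sketch of the Flask backend; endpoint names, the
# model-server URL, and the response shape are illustrative assumptions.
import requests
from flask import Flask, jsonify, request

app = Flask(__name__)

MODEL_SERVER_URL = "http://<model-server-host>:5001/generate"  # hypothetical endpoint

@app.route("/generate", methods=["POST"])
def generate():
    payload = request.get_json(silent=True) or {}
    description = payload.get("description", "").strip()
    if not description:
        return jsonify({"error": "empty description"}), 400

    # Forward the prompt to the GAN model server and relay its response.
    # Metadata (prompt, timestamp, image URL) could also be written to
    # Firebase Firestore at this point for real-time synchronization.
    resp = requests.post(MODEL_SERVER_URL, json={"text": description}, timeout=60)
    resp.raise_for_status()
    return jsonify(resp.json())  # e.g. {"image_url": "..."}

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```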
- Textual Input Interpretation: The algorithm processes user input, extracting relevant information using natural language processing (NLP). Whether the input is a simple phrase or a detailed description, the NLP techniques identify color, style, length, and design elements (see the extraction sketch after this list).
- Neural Network for Text-to-Image Conversion:
● Generator: This component creates an initial image representation based on the textual input. It translates extracted details (e.g., color, shape) into a visual concept.
● Discriminator: Evaluates the quality and realism of generated images by comparing them to a dataset of real fashion images. This feedback loop refines the generator’s output over time (see the GAN sketch after this list).
- Training the AI Model: The neural network trains on a diverse fashion image dataset, enabling it to generate a wide range of design concepts.
- Feedback Loop and Improvement: As users interact with the platform, the model learns from successes and mistakes, continuously improving its designs.
- User Customization: Users can further customize designs, specifying color choices, patterns, and fabric textures.
- Output Generation: The algorithm produces high-resolution images reflecting the user’s fashion concept.
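As referenced in the Textual Input Interpretation step, here is a minimal sketch of rule-based attribute extraction. The vocabularies and the `extract_attributes` helper are illustrative assumptions; the project's actual NLP preprocessing is not specified here.

```python
# nlp_extract.py -- minimal sketch of extracting fashion attributes from a
# prompt (the vocabularies and helper name are illustrative assumptions).
COLORS = {"red", "blue", "black", "white", "green", "beige"}
STYLES = {"casual", "formal", "bohemian", "vintage", "sporty"}
LENGTHS = {"mini", "midi", "maxi", "knee-length", "ankle-length"}

def extract_attributes(description: str) -> dict:
    tokens = description.lower().replace(",", " ").split()
    return {
        "color": [t for t in tokens if t in COLORS],
        "style": [t for t in tokens if t in STYLES],
        "length": [t for t in tokens if t in LENGTHS],
    }

print(extract_attributes("A casual knee-length blue summer dress"))
# {'color': ['blue'], 'style': ['casual'], 'length': ['knee-length']}
```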
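And here is a minimal PyTorch sketch of the text-conditioned generator–discriminator pair referenced above. The framework choice, layer sizes, embedding dimension, and image resolution are all assumptions; the report does not specify the actual architecture.

```python
# gan_sketch.py -- minimal PyTorch sketch of a text-conditioned GAN
# (layer sizes, embedding dim, and the framework choice are assumptions).
import torch
import torch.nn as nn

TEXT_DIM, NOISE_DIM, IMG_DIM = 256, 100, 64 * 64 * 3

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        # Maps [noise ; text embedding] to a flattened 64x64 RGB image.
        self.net = nn.Sequential(
            nn.Linear(NOISE_DIM + TEXT_DIM, 512), nn.ReLU(),
            nn.Linear(512, 1024), nn.ReLU(),
            nn.Linear(1024, IMG_DIM), nn.Tanh(),
        )

    def forward(self, noise, text_emb):
        return self.net(torch.cat([noise, text_emb], dim=1))

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        # Scores how real an image looks given its text embedding.
        self.net = nn.Sequential(
            nn.Linear(IMG_DIM + TEXT_DIM, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, 1), nn.Sigmoid(),
        )

    def forward(self, image, text_emb):
        return self.net(torch.cat([image, text_emb], dim=1))

# One adversarial step: the discriminator's feedback refines the generator.
gen, disc = Generator(), Discriminator()
noise = torch.randn(8, NOISE_DIM)
text_emb = torch.randn(8, TEXT_DIM)     # stand-in for an encoded prompt
fake = gen(noise, text_emb)
realism = disc(fake, text_emb)          # in (0, 1); generator's target is 1
loss = nn.functional.binary_cross_entropy(realism, torch.ones_like(realism))
loss.backward()                         # gradients flow back into the generator
```

In practice the generator and discriminator would be convolutional and trained alternately over many epochs; the fully connected layers here only illustrate the adversarial feedback loop.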
| Actors | Description |
|---|---|
| User | Users act as creators by providing textual descriptions of their fashion concepts, customizing generated designs, and offering feedback. They drive the creative process, shaping the AI-generated fashion designs to align with their vision. |
| Administrator | The administrator can modify the application by updating existing models or adding new ones, and can also add new features if required. |
The ER Diagram outlines the backend process of the application, including input processing and activity flow. When the initial screen appears, functions are executed based on user actions, and the input prompt plays a crucial role in this execution.
The Activity Diagram outlines the stages in the application process. It begins with opening the application and focusing on the input field for manual text entry. Users can then interact with the displayed image, either closing the application or using additional features based on text prompts.
In the DFD, the user queries the system for information, while the administrator has additional privileges, such as changing functions and swapping models. The system then generates the desired output and displays it on screen.
The model is trained on a meticulously curated dataset of 60,000 unique fashion product images and their corresponding descriptions. These images are labelled with multiple attributes that describe each product in the form of metadata, and this metadata is embedded using term frequency–inverse document frequency (TF-IDF).
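A minimal sketch of the TF-IDF embedding step using scikit-learn follows; the sample metadata strings are invented for illustration.

```python
# tfidf_embed.py -- minimal sketch of TF-IDF embedding of attribute metadata
# (the sample descriptions are invented for illustration).
from sklearn.feature_extraction.text import TfidfVectorizer

metadata = [
    "blue casual summer dress cotton",
    "black formal evening gown silk",
    "red sporty sneakers canvas",
]

vectorizer = TfidfVectorizer()
tfidf_matrix = vectorizer.fit_transform(metadata)  # shape: (n_images, vocab_size)

print(tfidf_matrix.shape)
print(vectorizer.get_feature_names_out())
```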