HumanEdit: A New High-Quality Instruction-Based Image Editing Dataset

HumanEdit: A New Approach for High-Quality, Instruction-Based Image Editing

Instruction-based image editing has made significant progress in recent years. By entering text instructions, users can edit images precisely and in many different ways. A crucial factor in the success of this technology is the quality of the underlying datasets. A new dataset called HumanEdit aims to raise the standard in this area.

The Challenge of Human Preferences

Previous large-scale image editing datasets often incorporate only minimal human feedback, which makes it difficult to align them with human preferences. As a result, the output of AI-based image editing often deviates from what users want, because the training data is not sufficiently aligned with human quality criteria.

HumanEdit: A Dataset with Human Refinement

HumanEdit closes this gap by having human annotators create data pairs and administrators provide feedback. In a multi-stage process that required over 2,500 hours of human work, 5,751 images were carefully curated. This effort ensures both accuracy and reliability for a wide range of image editing tasks.

Six Categories for Diverse Editing Possibilities

The dataset includes six different types of editing instructions:

- Action (e.g., "Rotate the image")
- Add (e.g., "Add a tree")
- Count (e.g., "Change the number of cars to three")
- Relation (e.g., "Place the ball in front of the dog")
- Remove (e.g., "Remove the background")
- Replace (e.g., "Replace the sky with a sunset")

These categories cover a wide range of real-world scenarios and enable a comprehensive evaluation of AI models for image editing.
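To make the six categories concrete, here is a minimal sketch of how a single HumanEdit-style record could be represented and tallied by category. The field names (input_image, output_image, mask_image, instruction, edit_type), the file paths, and the sample instruction "Add a bench" are illustrative assumptions, not the dataset's actual schema.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class EditType(Enum):
    # The six instruction categories named in the text.
    ACTION = "Action"
    ADD = "Add"
    COUNT = "Count"
    RELATION = "Relation"
    REMOVE = "Remove"
    REPLACE = "Replace"


@dataclass(frozen=True)
class EditRecord:
    # Hypothetical field names; the real dataset schema may differ.
    input_image: str    # path to the original image
    output_image: str   # path to the edited image
    mask_image: str     # path to the mask marking the edited region
    instruction: str    # the natural-language editing instruction
    edit_type: EditType


def count_by_type(records):
    """Return how many records fall into each of the six categories."""
    counts = Counter(r.edit_type for r in records)
    # Include categories with zero records so gaps in coverage are visible.
    return {t: counts.get(t, 0) for t in EditType}


# Made-up sample records, for illustration only.
records = [
    EditRecord("in/0.png", "out/0.png", "mask/0.png", "Add a tree", EditType.ADD),
    EditRecord("in/1.png", "out/1.png", "mask/1.png", "Remove the background", EditType.REMOVE),
    EditRecord("in/2.png", "out/2.png", "mask/2.png", "Replace the sky with a sunset", EditType.REPLACE),
    EditRecord("in/3.png", "out/3.png", "mask/3.png", "Add a bench", EditType.ADD),
]

per_type = count_by_type(records)
```

Reporting a count for every category, including empty ones, is a simple way to check whether an evaluation subset actually covers all six instruction types.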

Masks and High-Resolution Content

Every image in the dataset comes with a mask that precisely marks the edited area. For a portion of the data, the instructions were also written in enough detail to allow editing without a mask. HumanEdit further offers high diversity and high-resolution content (1024 × 1024 pixels) from various domains, making it a versatile benchmark for instruction-based image editing.
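One way such masks can be used is to check that an edit stays inside the marked region. The sketch below is a toy illustration in plain Python, assuming a binary mask (1 = edited, 0 = untouched) and tiny 4 × 4 grayscale grids in place of real images; the function names are hypothetical and are not part of any HumanEdit tooling.

```python
def edited_fraction(mask):
    """Share of pixels marked as edited (mask value 1)."""
    total = sum(len(row) for row in mask)
    edited = sum(sum(row) for row in mask)
    return edited / total


def changes_outside_mask(before, after, mask):
    """List (row, col) positions where the image changed but the mask is 0."""
    leaks = []
    for r, (row_b, row_a, row_m) in enumerate(zip(before, after, mask)):
        for c, (pb, pa, m) in enumerate(zip(row_b, row_a, row_m)):
            if pb != pa and m == 0:
                leaks.append((r, c))
    return leaks


# Toy 4 x 4 grayscale grids stand in for full-resolution images.
before = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
after = [
    [10, 10, 10, 10],
    [10, 99, 99, 10],
    [10, 99, 99, 10],
    [10, 10, 10, 10],
]
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

fraction = edited_fraction(mask)                    # 4 of 16 pixels are marked
leaks = changes_outside_mask(before, after, mask)   # empty when the edit respects the mask
```

In practice, edited images rarely match the original pixel for pixel outside the mask, so a real check would likely compare against a tolerance rather than test for exact equality.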

A New Standard for Image Editing

HumanEdit aims to advance research in the field of image editing and establish new evaluation benchmarks. By incorporating human feedback and providing high-quality, diverse data, HumanEdit supports the development of more precise and reliable AI models for image editing. For companies like Mindverse, which develop AI-powered content tools, HumanEdit offers a valuable resource for developing and optimizing image editing features. Combined with Mindverse's text, image, and research capabilities, models trained on HumanEdit could be integrated into customized solutions such as chatbots, voicebots, AI search engines, and knowledge systems.

Outlook

The development of HumanEdit is an important step towards more user-friendly and efficient image editing. By focusing on human preferences and providing high-quality data, it narrows the gap between what AI can do and what users need. It remains to be seen how researchers will use this new dataset to push the boundaries of instruction-based image editing. The release of HumanEdit on Hugging Face allows the community to take an active part in advancing this technology and to build innovative applications.
