Paper Review/Computer Vision (2 posts)

[Paper Review] AffordanceLLM: Grounding Affordance from Vision Language Models
AffordanceLLM: Grounding Affordance from Vision Language Models (arxiv.org): Affordance grounding refers to the task of finding the area of an object with which one can interact. It is a fundamental but challenging task, as a successful solution requires the comprehensive understanding of a scene in multiple aspects including detec..
The paper I'm reviewing this time is AffordanceLLM. Let's get started. Abstract: Affordance groundin..
2025. 4. 30.

[Paper Review] InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
The paper I'm reviewing this time is InfiniteYou, a photo recrafting paper released at the end of last month.
InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity (arxiv.org): Achieving flexible and high-fidelity identity-preserved image generation remains formidable, particularly with advanced Diffusion Transformers (DiTs) like FLUX. We introduce InfiniteYou (InfU), one of the earliest robust frameworks leveraging DiTs for this..
Ab..
2025. 4. 29.