
All categories (9)

[Paper Review] AffordanceLLM: Grounding Affordance from Vision Language Models
Linked paper (arxiv.org): AffordanceLLM: Grounding Affordance from Vision Language Models. Affordance grounding refers to the task of finding the area of an object with which one can interact. It is a fundamental but challenging task, as a successful solution requires the comprehensive understanding of a scene in multiple aspects including detec..
The paper reviewed this time is AffordanceLLM. Let's get started. Abstract: Affordance groundin..
2025. 4. 30.
[Paper Review] InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
This review covers InfiniteYou, a photo recrafting paper released at the end of last month.
Linked paper (arxiv.org): InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity. Achieving flexible and high-fidelity identity-preserved image generation remains formidable, particularly with advanced Diffusion Transformers (DiTs) like FLUX. We introduce InfiniteYou (InfU), one of the earliest robust frameworks leveraging DiTs for this..
Ab..
2025. 4. 29.
[Paper Review] DDT: Decoupled Diffusion Transformer
Linked paper (arxiv.org): DDT: Decoupled Diffusion Transformer. Diffusion transformers have demonstrated remarkable generation quality, albeit requiring longer training iterations and numerous inference steps. In each denoising step, diffusion transformers encode the noisy inputs to extract the lower-frequency semantic..
Today's paper is DDT, which can be described as a paper aimed at overcoming the limitations of existing Transformer-based diffusion processes such as DiT and SiT ..
2025. 4. 17.
[Paper Review] EXAONE Deep: Reasoning Enhanced Language Models
Linked paper (arxiv.org): EXAONE Deep: Reasoning Enhanced Language Models. We present EXAONE Deep series, which exhibits superior capabilities in various reasoning tasks, including math and coding benchmarks. We train our models mainly on the reasoning-specialized dataset that incorporates long streams of thought processes. Evalu..
Today (2025. 3. 18), the LG Research Team announced EXAONE Deep. In line with current trends, its reasoning performance has been greatly ..
2025. 3. 18.
[Paper Review] R.A.C.E.: Robust Adversarial Concept Erasure for Secure Text-to-Image Diffusion Model
Linked paper (arxiv.org): R.A.C.E.: Robust Adversarial Concept Erasure for Secure Text-to-Image Diffusion Model. In the evolving landscape of text-to-image (T2I) diffusion models, the remarkable capability to generate high-quality images from textual descriptions faces challenges with the potential misuse of reproducing sensitive content. To address this critical iss..
This time I picked a paper by a professor at my own university whom I had contacted. The area I mainly take an interest in and research ..
2025. 3. 17.
[Paper Review] AoT: Atom of Thoughts for Markov LLM Test-Time Scaling
Linked paper (arxiv.org): Atom of Thoughts for Markov LLM Test-Time Scaling. Large Language Models (LLMs) achieve superior performance through training-time scaling, and test-time scaling further enhances their capabilities by conducting effective reasoning during inference. However, as the scale of reasoning increases, existing te..
AoT (Atom of Thoughts) has been newly proposed as a new horizon for Chain-of-Thought. For a question, Decomp..
2025. 3. 11.