-
DRIVE: Data Curation Best Practices For Reinforcement Learning With Verifiable Reward In Competitive Code Generation
(2025)
• No Venue
Zhu et al.
-
Multiagentbench: Evaluating The Collaboration And Competition Of LLM Agents
(2025)
• No Venue
Zhu et al.
-
Longwriter-v: Enabling Ultra-long And High-fidelity Generation In Vision-language Models
(2025)
• No Venue
Tu et al.
-
Time Blindness: Why Video-language Models Can't See What Humans Can?
(2025)
• No Venue
Upadhyay et al.
-
Drivel-ology: Challenging Llms With Interpreting Nonsense With Depth
(2025)
• No Venue
Wang et al.
-
CODESYNC: Synchronizing Large Language Models With Dynamic Code Evolution At Scale
(2025)
• No Venue
Wang et al.
-
Chain-of-retrieval Augmented Generation
(2025)
• No Venue
Wang et al.
-
Cinemaster: A 3d-aware And Controllable Framework For Cinematic Text-to-video Generation
(2025)
• No Venue
Wang et al.
-
Cmphysbench: A Benchmark For Evaluating Large Language Models In Condensed Matter Physics
(2025)
• No Venue
Wang et al.
-
Critique Fine-tuning: Learning To Critique Is More Effective Than Learning To Imitate
(2025)
• No Venue
Yubo Wang, Xiang Yue, Wenhu Chen
-
Coser: Coordinating Llm-based Persona Simulation Of Established Roles
(2025)
• No Venue
Wang et al.
-
Fantasyportrait: Enhancing Multi-character Portrait Animation With Expression-augmented Diffusion Transformers
(2025)
• No Venue
Wang et al.
-
Fostering Video Reasoning Via Next-event Prediction
(2025)
• No Venue
Wang et al.
-
GPT-IMAGE-EDIT-1.5M: A Million-scale, Gpt-generated Image Dataset
(2025)
• No Venue
Wang et al.
-
Internsvg: Towards Unified SVG Tasks With Multimodal Large Language Models
(2025)
• No Venue
Wang et al.
-
F2LLM Technical Report: Matching SOTA Embedding Performance With 6 Million Open-source Data
(2025)
• No Venue
Zhang et al.
-
Megamath: Pushing The Limits Of Open Math Corpora
(2025)
• No Venue
Zhou et al.
-
Neural-driven Image Editing
(2025)
• No Venue
Zhou et al.
-
Omniworld: A Multi-domain And Multi-modal Dataset For 4D World Modeling
(2025)
• No Venue
Zhou et al.
-
Roborefer: Towards Spatial Referring With Reasoning In Vision-language Models For Robotics
(2025)
• No Venue
Zhou et al.
-
Phi-ground Tech Report: Advancing Perception In GUI Grounding
(2025)
• No Venue
Zhang et al.
-
GKG-LLM: A Unified Framework For Generalized Knowledge Graph Construction
(2025)
• No Venue
Zhang et al.
-
Qwen3 Embedding: Advancing Text Embedding And Reranking Through Foundation Models
(2025)
• No Venue
Zhang et al.
-
Speakervid-5m: A Large-scale High-quality Dataset For Audio-visual Dyadic Interactive Human Generation
(2025)
• No Venue
Zhang et al.
-
Videollama 3: Frontier Multimodal Foundation Models For Image And Video Understanding
(2025)
• No Venue
Zhang et al.
-
Unified Multimodal Understanding And Generation Models: Advances, Challenges, And Opportunities
(2025)
• No Venue
Zhang et al.
-
Babel: Open Multilingual Large Language Models Serving Over 90% Of Global Speakers
(2025)
• No Venue
Zhao et al.
-
Omnialign-v: Towards Enhanced Alignment Of Mllms With Human Preference
(2025)
• No Venue
Zhao et al.
-
Lex-art: Rethinking Text Generation Via Scalable High-quality Data Synthesis
(2025)
• No Venue
Zhao et al.
-
R1-omni: Explainable Omni-multimodal Emotion Recognition With Reinforcing Learning
(2025)
• No Venue
Jiaxing Zhao, Xihan Wei, Liefeng Bo
-
Promptcot 2.0: Scaling Prompt Synthesis For Large Language Model Reasoning
(2025)
• No Venue
Zhao et al.
-
One Token To Fool Llm-as-a-judge
(2025)
• No Venue
Zhao et al.
-
SAIL-VL2 Technical Report
(2025)
• No Venue
Yin et al.
-
Aligning Multimodal LLM With Human Preference: A Survey
(2025)
• No Venue
Yu et al.
-
Demystifying Reinforcement Learning In Agentic Reasoning
(2025)
• No Venue
Yu et al.
-
How Far Are Vlms From Visual Spatial Intelligence? A Benchmark-driven Perspective
(2025)
• No Venue
Yu et al.
-
Vrbench: A Benchmark For Multi-step Reasoning In Long Narrative Videos
(2025)
• No Venue
Yu et al.
-
Unicorn: Text-only Data Synthesis For Vision Language Model Training
(2025)
• No Venue
Yu et al.
-
Z1: Efficient Test-time Scaling With Code
(2025)
• No Venue
Yu et al.
-
Agent-r: Training Language Model Agents To Reflect Via Iterative Self-training
(2025)
• No Venue
Yuan et al.
-
Sa2va: Marrying SAM2 With Llava For Dense Grounded Understanding Of Images And Videos
(2025)
• No Venue
Yuan et al.
-
Refeed: Multi-dimensional Summarization Refinement With Reflective Reasoning On Feedback
(2025)
• No Venue
Yun et al.
-
Multi-swe-bench: A Multilingual Benchmark For Issue Resolving
(2025)
• No Venue
Zan et al.
-
Aralingbench: A Human-annotated Benchmark For Evaluating Arabic Linguistic Capabilities Of Large Language Models
(2025)
• No Venue
Zbib et al.
-
A Vision-language-action-critic Model For Robotic Real-world Reinforcement Learning
(2025)
• No Venue
Zhai et al.
-
Skywork-swe: Unveiling Data Scaling Laws For Software Engineering In Llms
(2025)
• No Venue
Zeng et al.
-
2.5 Years In Class: A Multimodal Textbook For Vision-language Pretraining
(2025)
• No Venue
Zhang et al.
-
Bee: A High-quality Corpus And Full-stack Suite To Unlock Advanced Fully Open Mllms
(2025)
• No Venue
Zhang et al.
-
Basereward: A Strong Baseline For Multimodal Reward Model
(2025)
• No Venue
Zhang et al.
-
Autoenv: Automated Environments For Measuring Cross-environment Agent Learning
(2025)
• No Venue
Zhang et al.
-
Domain2vec: Vectorizing Datasets To Find The Optimal Data Mixture Without Training
(2025)
• No Venue
Zhang et al.
-
Mathcoder-vl: Bridging Vision And Code For Enhanced Multimodal Mathematical Reasoning
(2025)
• No Venue
Wang et al.
-
Multishotmaster: A Controllable Multi-shot Video Generation Framework
(2025)
• No Venue
Wang et al.
-
Mr-align: Meta-reasoning Informed Factuality Alignment For Large Reasoning Models
(2025)
• No Venue
Wang et al.
-
Opencua: Open Foundations For Computer-use Agents
(2025)
• No Venue
Wang et al.
-
Skywork-vl Reward: An Effective Reward Model For Multimodal Understanding And Reasoning
(2025)
• No Venue
Wang et al.
-
Pref-grpo: Pairwise Preference Reward-based GRPO For Stable Text-to-image Reinforcement Learning
(2025)
• No Venue
Wang et al.
-
Roboomni: Proactive Robot Manipulation In Omni-modal Context
(2025)
• No Venue
Wang et al.
-
Scaling Pre-training To One Hundred Billion Data For Vision Language Models
(2025)
• No Venue
Wang et al.
-
Textatlas5m: A Large-scale Dataset For Dense Text Image Generation
(2025)
• No Venue
Wang et al.
-
Finevision: Open Data Is All You Need
(2025)
• No Venue
Wiedmann et al.
-
Vision-zero: Scalable VLM Self-improvement Via Strategic Gamified Self-play
(2025)
• No Venue
Wang et al.
-
Video-thinker: Sparking "thinking With Videos" Via Reinforcement Learning
(2025)
• No Venue
Wang et al.
-
Worldpm: Scaling Human Preference Modeling
(2025)
• No Venue
Wang et al.
-
Mocha: Towards Movie-grade Talking Character Synthesis
(2025)
• No Venue
Wei et al.
-
Rank1: Test-time Compute For Reranking In Information Retrieval
(2025)
• No Venue
Weller et al.
-
Seq Vs Seq: An Open Suite Of Paired Encoders And Decoders
(2025)
• No Venue
Weller et al.
-
3D Scene Generation: A Survey
(2025)
• No Venue
Wen et al.
-
Spot The Fake: Large Multimodal Model-based Synthetic Image Detection With Artifact Explanation
(2025)
• No Venue
Wen et al.
-
Widesearch: Benchmarking Agentic Broad Info-seeking
(2025)
• No Venue
Wong et al.
-
Lightgen: Efficient Image Generation Through Knowledge Distillation And Direct Preference Optimization
(2025)
• No Venue
Wu et al.
-
Less-to-more Generalization: Unlocking More Controllability By In-context Generation
(2025)
• No Venue
Wu et al.
-
Any2caption: Interpreting Any Condition To Caption For Controllable Video Generation
(2025)
• No Venue
Wu et al.
-
Omnigen2: Exploration To Advanced Multimodal Generation
(2025)
• No Venue
Wu et al.
-
Reasoning Or Memorization? Unreliable Results Of Reinforcement Learning Due To Data Contamination
(2025)
• No Venue
Wu et al.
-
Qwen-image Technical Report
(2025)
• No Venue
Wu et al.
-
Spatial-mllm: Boosting MLLM Capabilities In Visual-based Spatial Intelligence
(2025)
• No Venue
Wu et al.
-
Writingbench: A Comprehensive Benchmark For Generative Writing
(2025)
• No Venue
Wu et al.
-
BMMR: A Large-scale Bilingual Multimodal Multi-discipline Reasoning Dataset
(2025)
• No Venue
Xi et al.
-
Dense Retrievers Can Fail On Simple Queries: Revealing The Granularity Dilemma Of Embeddings
(2025)
• No Venue
Xu et al.
-
Leetcodedataset: A Temporal Dataset For Robust Evaluation And Efficient Training Of Code Llms
(2025)
• No Venue
Xia et al.
-
Open Data Synthesis For Deep Research
(2025)
• No Venue
Xia et al.
-
Retrieval-augmented Large Language Models For Financial Time Series Forecasting
(2025)
• No Venue
Xiao et al.
-
MIEB: Massive Image Embedding Benchmark
(2025)
• No Venue
Xiao et al.
-
Ui-genie: A Self-improving Approach For Iteratively Boosting Mllm-based Mobile GUI Agents
(2025)
• No Venue
Xiao et al.
-
Are Vlms Ready For Autonomous Driving? An Empirical Study From The Reliability, Data, And Metric Perspectives
(2025)
• No Venue
Xie et al.
-
Llms Can Get "brain Rot"!
(2025)
• No Venue
Xing et al.
-
Jodi: Unification Of Visual Generation And Understanding Via Joint Modeling
(2025)
• No Venue
Xu et al.
-
Kodcode: A Diverse, Challenging, And Verifiable Synthetic Dataset For Coding
(2025)
• No Venue
Xu et al.
-
Mind The Gap: Bridging Thought Leap For Improved Chain-of-thought Tuning
(2025)
• No Venue
Xu et al.
-
Visulogic: A Benchmark For Evaluating Visual Reasoning In Multi-modal Large Language Models
(2025)
• No Venue
Xu et al.
-
TOUCAN: Synthesizing 1.5M Tool-agentic Data From Real-world MCP Environments
(2025)
• No Venue
Xu et al.
-
Audio-flan: A Preliminary Release
(2025)
• No Venue
Xue et al.
-
Withanyone: Towards Controllable And ID Consistent Image Generation
(2025)
• No Venue
Xu et al.
-
Oceangym: A Benchmark Environment For Underwater Embodied Agents
(2025)
• No Venue
Xue et al.
-
Gpt-imgeval: A Comprehensive Benchmark For Diagnosing Gpt4o In Image Generation
(2025)
• No Venue
Yan et al.
-
Egolife: Towards Egocentric Life Assistant
(2025)
• No Venue
Yang et al.
-
Magma: A Foundation Model For Multimodal AI Agents
(2025)
• No Venue
Yang et al.
-
Steering Vision-language-action Models As Anti-exploration: A Test-time Scaling Approach
(2025)
• No Venue
Yang et al.
-
Table-r1: Inference-time Scaling For Table Reasoning
(2025)
• No Venue
Yang et al.
-
Too Good To Be Bad: On The Failure Of Llms To Role-play Villains
(2025)
• No Venue
Yi et al.
-
Through-the-mask: Mask-based Motion Trajectories For Image-to-video Generation
(2025)
• No Venue
Yariv et al.
-
Echo-4o: Harnessing The Power Of Gpt-4o Synthetic Images For Improved Image Generation
(2025)
• No Venue
Ye et al.
-
Seeing From Another Perspective: Evaluating Multi-view Understanding In Mllms
(2025)
• No Venue
Yeh et al.
-
Primitiveanything: Human-crafted 3D Primitive Assembly Generation With Auto-regressive Transformer
(2025)
• No Venue
Ye et al.
-
Shapellm-omni: A Native Multimodal LLM For 3D Generation And Understanding
(2025)
• No Venue
Ye et al.
-
Phi-4-mini Technical Report: Compact Yet Powerful Multimodal Language Models Via Mixture-of-loras
(2025)
• No Venue
Abouelenin et al.
-
Emergent Misalignment Via In-context Learning: Narrow In-context Examples Can Produce Broadly Misaligned Llms
(2025)
• No Venue
Afonin et al.
-
Language Models' Factuality Depends On The Language Of Inquiry
(2025)
• No Venue
Aggarwal et al.
-
Essential-web V1.0: 24T Tokens Of Organized Web Data
(2025)
• No Venue
Ai et al.
-
Sadeed: Advancing Arabic Diacritization Through Small Language Model
(2025)
• No Venue
Aldallal et al.
-
Atla Selene Mini: A General Purpose Evaluation Model
(2025)
• No Venue
Alexandru et al.
-
Smollm2: When Smol Goes Big -- Data-centric Training Of A Small Language Model
(2025)
• No Venue
Allal et al.
-
Amo-bench: Large Language Models Still Struggle In High School Math Competitions
(2025)
• No Venue
An et al.
-
Llava-onevision-1.5: Fully Open Framework For Democratized Multimodal Training
(2025)
• No Venue
An et al.
-
Herobench: A Benchmark For Long-horizon Planning And Structured Reasoning In Virtual Worlds
(2025)
• No Venue
Anokhin et al.
-
Tabstar: A Foundation Tabular Model With Semantically Target-aware Representations
(2025)
• No Venue
Alan Arazi, Eilam Shapira, Roi Reichart
-
Towards Best Practices For Open Datasets For LLM Training
(2025)
• No Venue
Baack et al.
-
Swe-rebench: An Automated Pipeline For Task Collection And Decontaminated Evaluation Of Software Engineering Agents
(2025)
• No Venue
Badertdinov et al.
-
Eurobert: Scaling Multilingual Encoders For European Languages
(2025)
• No Venue
Boizard et al.
-
A Data-centric Framework For Addressing Phonetic And Prosodic Challenges In Russian Speech Generative Models
(2025)
• No Venue
Borodin et al.
-
Video Action Differencing
(2025)
• No Venue
Burgess et al.
-
Microvqa: A Multimodal Reasoning Benchmark For Microscopy-based Scientific Research
(2025)
• No Venue
Burgess et al.
-
Crowdsource, Crawl, Or Generate? Creating SEA-VL, A Multicultural Vision-language Dataset For Southeast Asia
(2025)
• No Venue
Cahyawijaya et al.
-
MORSE-500: A Programmatically Controllable Video Benchmark To Stress-test Multimodal Reasoning
(2025)
• No Venue
Cai et al.
-
Web-shepherd: Advancing Prms For Reinforcing Web Agents
(2025)
• No Venue
Chae et al.
-
Webscale-rl: Automated Data Pipeline For Scaling RL Data To Pretraining Levels
(2025)
• No Venue
Cen et al.
-
A3: Android Agent Arena For Mobile GUI Agents
(2025)
• No Venue
Chai et al.
-
Perceptually Accurate 3D Talking Head Generation: New Definitions, Speech-mesh Representation, And Evaluation Metrics
(2025)
• No Venue
Chae-Yeon et al.
-
Game-time: Evaluating Temporal Dynamics In Spoken Language Models
(2025)
• No Venue
Chang et al.
-
Humo: Human-centric Video Generation Via Collaborative Multi-modal Conditioning
(2025)
• No Venue
Chen et al.
-
Blip3-o: A Family Of Fully Open Unified Multimodal Models-architecture, Training And Dataset
(2025)
• No Venue
Chen et al.
-
Code2video: A Code-centric Paradigm For Educational Video Generation
(2025)
• No Venue
Yanzhe Chen, Kevin Qinghong Lin, Mike Zheng Shou
-
Halumem: Evaluating Hallucinations In Memory Systems Of Agents
(2025)
• No Venue
Chen et al.
-
FINEREASON: Evaluating And Improving Llms' Deliberate Reasoning Through Reflective Puzzle Solving
(2025)
• No Venue
Chen et al.
-
Fusionaudio-1.2m: Towards Fine-grained Audio Captioning With Multimodal Contextual Fusion
(2025)
• No Venue
Chen et al.
-
MIG: Automatic Data Selection For Instruction Tuning By Maximizing Information Gain In Semantic Space
(2025)
• No Venue
Chen et al.
-
Moca: Modality-aware Continual Pre-training Makes Better Bidirectional Multimodal Embeddings
(2025)
• No Venue
Chen et al.
-
Opengpt-4o-image: A Comprehensive Dataset For Advanced Image Generation And Editing
(2025)
• No Venue
Chen et al.
-
Paper2web: Let's Make Your Paper Alive!
(2025)
• No Venue
Chen et al.
-
Sharegpt-4o-image: Aligning Multimodal Models With Gpt-4o-level Image Generation
(2025)
• No Venue
Chen et al.
-
Xverify: Efficient Answer Verifier For Reasoning Model Evaluations
(2025)
• No Venue
Chen et al.
-
Ui-ins: Enhancing GUI Grounding With Multi-perspective Instruction-as-reasoning
(2025)
• No Venue
Chen et al.
-
Videovista-culturallingo: 360° Horizons-bridging Cultures, Languages, And Domains In Video Comprehension
(2025)
• No Venue
Chen et al.
-
Multimodal Evaluation Of Russian-language Architectures
(2025)
• No Venue
Chervyakov et al.
-
System Prompt Optimization With Meta-learning
(2025)
• No Venue
Yumin Choi, Jinheon Baek, Sung Ju Hwang
-
Instruction-guided Lesion Segmentation For Chest X-rays With Automatically Generated Large-scale Dataset
(2025)
• No Venue
Choi et al.
-
WEAVE: Unleashing And Benchmarking The In-context Interleaved Comprehension And Generation
(2025)
• No Venue
Chow et al.
-
Overview Of The TREC 2021 Deep Learning Track
(2025)
• arXiv
• 58 citations
Craswell et al.
-
This Time Is Different: An Observability Perspective On Time Series Foundation Models
(2025)
• No Venue
Cohen et al.
-
Reinforcement Learning For Reasoning In Small Llms: What Works And What Doesn't
(2025)
• No Venue
Quy-Anh Dang, Chris Ngo
-
Meshcoder: Llm-powered Structured Mesh Code Generation From Point Clouds
(2025)
• No Venue
Dai et al.
-
Toolscope: An Agentic Framework For Vision-guided And Long-horizon Tool Use
(2025)
• No Venue
Mengjie Deng, Guanting Dong, Zhicheng Dou
-
Self-improvement In Multimodal Large Language Models: A Survey
(2025)
• No Venue
Deng et al.
-
CLIMB: Clustering-based Iterative Data Mixture Bootstrapping For Language Model Pre-training
(2025)
• No Venue
Diao et al.
-
Mmdocir: Benchmarking Multi-modal Retrieval For Long Documents
(2025)
• No Venue
Dong et al.
-
Motionsight: Boosting Fine-grained Motion Understanding In Multimodal Llms
(2025)
• No Venue
Du et al.
-
Megascience: Pushing The Frontiers Of Post-training Datasets For Science Reasoning
(2025)
• No Venue
Run-Ze Fan, Zengzhi Wang, Pengfei Liu
-
Missing Premise Exacerbates Overthinking: Are Reasoning Models Losing Critical Thinking Skill?
(2025)
• No Venue
Fan et al.
-
Flux-reason-6m & Prism-bench: A Million-scale Text-to-image Reasoning Dataset And Comprehensive Benchmark
(2025)
• No Venue
Fang et al.
-
Got: Unleashing Reasoning Capability Of Multimodal Large Language Model For Visual Generation And Editing
(2025)
• No Venue
Fang et al.
-
Grounding Computer Use Agents On Human Demonstrations
(2025)
• No Venue
Feizi et al.
-
Can Mllms Guide Me Home? A Benchmark Study On Fine-grained Visual Reasoning From Transit Maps
(2025)
• No Venue
Feng et al.
-
WILDCHAT-50M: A Deep Dive Into The Role Of Synthetic Data In Post-training
(2025)
• No Venue
Benjamin Feuer, Chinmay Hegde
-
Video-r1: Reinforcing Video Reasoning In Mllms
(2025)
• No Venue
Feng et al.
-
Listener-rewarded Thinking In Vlms For Image Preferences
(2025)
• No Venue
Gambashidze et al.
-
Cognitive Behaviors That Enable Self-improving Reasoners, Or, Four Habits Of Highly Effective Stars
(2025)
• No Venue
Gandhi et al.
-
A Strategic Coordination Framework Of Small Llms Matches Large Llms In Data Synthesis
(2025)
• No Venue
Gao et al.
-
R&B: Domain Regrouping And Data Mixture Balancing For Efficient Foundation Model Training
(2025)
• No Venue
Ge et al.
-
Arc-hunyuan-video-7b: Structured Video Comprehension Of Real-world Shorts
(2025)
• No Venue
Ge et al.
-
Audio Flamingo 2: An Audio-language Model With Long-audio Understanding And Expert Reasoning Abilities
(2025)
• No Venue
Ghosh et al.
-
Lment: A Suite For Analyzing Knowledge In Language Models From Pretraining Data To Representations
(2025)
• No Venue
Gottesman et al.
-
Openthoughts: Data Recipes For Reasoning Models
(2025)
• No Venue
Guha et al.
-
ACADREASON: Exploring The Limits Of Reasoning Models With Academic Research Problems
(2025)
• No Venue
Gui et al.
-
Swe-factory: Your Automated Factory For Issue Resolution Training Data And Evaluation Benchmarks
(2025)
• No Venue
Guo et al.
-
Beyond The Last Answer: Your Reasoning Trace Uncovers More Than You Think
(2025)
• No Venue
Hasan Abed Al Kader Hammoud, Hani Itani, Bernard Ghanem
-
Mesatask: Towards Task-driven Tabletop Scene Generation Via 3D Spatial Reasoning
(2025)
• No Venue
Hao et al.
-
MAGA: Massive Genre-audience Reformulation To Pretraining Corpus Expansion
(2025)
• No Venue
Xintong Hao, Ke Shen, Chenggang Li
-
Unireditbench: A Unified Reasoning-based Image Editing Benchmark
(2025)
• No Venue
Han et al.
-
Learnings From Scaling Visual Tokenizers For Reconstruction And Generation
(2025)
• No Venue
Hansen-Estruch et al.
-
Pasa: An LLM Agent For Comprehensive Academic Paper Search
(2025)
• No Venue
He et al.
-
Hardtests: Synthesizing High-quality Test Cases For LLM Coding
(2025)
• No Venue
He et al.
-
Videossr: Video Self-supervised Reinforcement Learning
(2025)
• No Venue
He et al.
-
CASS: Nvidia To AMD Transpilation With Data, Models, And Benchmark
(2025)
• No Venue
Heakl et al.
-
Mutarjim: Advancing Bidirectional Arabic-english Translation With A Small Language Model
(2025)
• No Venue
Hennara et al.
-
Wasm: A Pipeline For Constructing Structured Arabic Interleaved Multimodal Corpora
(2025)
• No Venue
Hennara et al.
-
Charting And Navigating Hugging Face's Model Atlas
(2025)
• No Venue
Horwitz et al.
-
Quest: Incentivizing Llms To Generate Difficult Problems
(2025)
• No Venue
Hu et al.
-
Finsearchcomp: Towards A Realistic, Expert-level Evaluation Of Financial Search And Reasoning
(2025)
• No Venue
Hu et al.
-
A Survey Of Scientific Large Language Models: From Data Foundations To Agent Frontiers
(2025)
• No Venue
Hu et al.
-
Video-mmmu: Evaluating Knowledge Acquisition From Multi-discipline Professional Videos
(2025)
• No Venue
Hu et al.
-
Benchmax: A Comprehensive Multilingual Evaluation Suite For Large Language Models
(2025)
• No Venue
Huang et al.
-
Loong: Synthesize Long Chain-of-thoughts At Scale Through Verifiers
(2025)
• No Venue
Huang et al.
-
Vision-r1: Incentivizing Reasoning Capability In Multimodal Large Language Models
(2025)
• No Venue
Huang et al.
-
Vistadpo: Video Hierarchical Spatial-temporal Direct Preference Optimization For Large Video Models
(2025)
• No Venue
Huang et al.
-
Sentinel: SOTA Model To Protect Against Prompt Injections
(2025)
• No Venue
Dror Ivry, Oran Nahum
-
The African Languages Lab: A Collaborative Approach To Advancing Low-resource African NLP
(2025)
• No Venue
Issaka et al.
-
Ambik: Dataset Of Ambiguous Tasks In Kitchen Environment
(2025)
• No Venue
Ivanova et al.
-
Reasoning Model Is Stubborn: Diagnosing Instruction Overriding In Reasoning Models
(2025)
• No Venue
Jang et al.
-
Adaptive Multi-agent Response Refinement In Conversational Systems
(2025)
• No Venue
Jeong et al.
-
CSVQA: A Chinese Multimodal Benchmark For Evaluating STEM Reasoning Capabilities Of Vlms
(2025)
• No Venue
Jian et al.
-
Omnispatial: Towards Comprehensive Spatial Reasoning Benchmark For Vision Language Models
(2025)
• No Venue
Jia et al.
-
Visualwebinstruct: Scaling Up Multimodal Instruction Data Through Web Search
(2025)
• No Venue
Jia et al.
-
Rynnvla-001: Using Human Demonstrations To Improve Robot Manipulation
(2025)
• No Venue
Jiang et al.
-
Omni-reward: Towards Generalist Omni-modal Reward Modeling With Free-form Preferences
(2025)
• No Venue
Jin et al.
-
Expect The Unexpected: Failsafe Long Context QA For Finance
(2025)
• No Venue
Kamble et al.
-
The Common Pile V0.1: An 8TB Dataset Of Public Domain And Openly Licensed Text
(2025)
• No Venue
Kandpal et al.
-
First Try Matters: Revisiting The Role Of Reflection In Reasoning Models
(2025)
• No Venue
Kang et al.
-
LEGION: Learning To Ground And Explain For Synthetic Image Detection
(2025)
• No Venue
Kang et al.
-
Robot-r1: Reinforcement Learning For Enhanced Embodied Reasoning In Robotics
(2025)
• No Venue
Kim et al.
-
Mol-llama: Towards General Understanding Of Molecules In Large Molecular Language Model
(2025)
• No Venue
Dongki Kim, Wonbin Lee, Sung Ju Hwang
-
From Scores To Skills: A Cognitive Diagnosis Framework For Evaluating Financial Large Language Models
(2025)
• No Venue
Kuang et al.
-
Nohumansrequired: Autonomous High-quality Image Editing Triplet Mining
(2025)
• No Venue
Kuprashevich et al.
-
Opensir: Open-ended Self-improving Reasoner
(2025)
• No Venue
Kwan et al.
-
Mini-o3: Scaling Up Reasoning Patterns And Interaction Turns For Visual Search
(2025)
• No Venue
Lai et al.
-
Rethinking Reward Models For Multi-domain Test-time Scaling
(2025)
• No Venue
Lee et al.
-
Stream3r: Scalable Sequential 3D Reconstruction With Causal Transformer
(2025)
• No Venue
Lan et al.
-
MMR1: Enhancing Multimodal Reasoning With Variance-aware Sampling And Open Resources
(2025)
• No Venue
Leng et al.
-
Miromind-m1: An Open-source Advancement In Mathematical Reasoning Via Context-aware Multi-stage Policy Optimization
(2025)
• No Venue
Li et al.
-
IGGT: Instance-grounded Geometry Transformer For Semantic 3D Reconstruction
(2025)
• No Venue
Li et al.
-
Droplet3d: Commonsense Priors From Videos Facilitate 3D Generation
(2025)
• No Venue
Li et al.
-
Drafterbench: Benchmarking Large Language Models For Tasks Automation In Civil Engineering
(2025)
• No Venue
Yinsheng Li, Zhen Dong, Yi Shao
-
Migician: Revealing The Magic Of Free-form Multi-image Grounding In Multimodal Large Language Models
(2025)
• No Venue
Li et al.
-
Ovo-bench: How Far Is Your Video-llms From Real-world Online Video Understanding?
(2025)
• No Venue
Li et al.
-
Sos1: O1 And R1-like Reasoning Llms Are Sum-of-square Solvers
(2025)
• No Venue
Li et al.
-
SWE-SQL: Illuminating LLM Pathways To Solve User SQL Issues In Real-world Applications
(2025)
• No Venue
Li et al.
-
Temporal Preference Optimization For Long-form Video Understanding
(2025)
• No Venue
Li et al.
-
Truth In The Few: High-value Data Selection For Efficient Multi-modal Reasoning
(2025)
• No Venue
Li et al.
-
Zebra-cot: A Dataset For Interleaved Vision Language Reasoning
(2025)
• No Venue
Li et al.
-
Describe Anything: Detailed Localized Image And Video Captioning
(2025)
• No Venue
Lian et al.
-
Modomodo: Multi-domain Data Mixtures For Multimodal LLM Reinforcement Learning
(2025)
• No Venue
Liang et al.
-
URECA: Unique Region Caption Anything
(2025)
• No Venue
Lim et al.
-
Embrace-3k: Embodied Reasoning And Action In Complex Environments
(2025)
• No Venue
Lin et al.
-
Partcrafter: Structured 3D Mesh Generation Via Compositional Latent Diffusion Transformers
(2025)
• No Venue
Lin et al.
-
Ost-bench: Evaluating The Capabilities Of Mllms In Online Spatio-temporal Scene Understanding
(2025)
• No Venue
Lin et al.
-
Towards Understanding Camera Motions In Any Video
(2025)
• No Venue
Lin et al.
-
Beyond Distillation: Pushing The Limits Of Medical LLM Reasoning With Minimalist Rule-based RL
(2025)
• No Venue
Liu et al.
-
Llm-powered GUI Agents In Phone Automation: Surveying Progress And Prospects
(2025)
• No Venue
Liu et al.
-
Langscene-x: Reconstruct Generalizable 3D Language-embedded Scenes With Trimap Video Diffusion
(2025)
• No Venue
Liu et al.
-
Shotbench: Expert-level Cinematic Understanding In Vision-language Models
(2025)
• No Venue
Liu et al.
-
Part I: Tricks Or Traps? A Deep Dive Into RL For LLM Reasoning
(2025)
• No Venue
Liu et al.
-
Pairwise RM: Perform Best-of-n Sampling With Knockout Tournament
(2025)
• No Venue
Liu et al.
-
Quadmix: Quality-diversity Balanced Data Selection For Efficient LLM Pretraining
(2025)
• No Venue
Liu et al.
-
Points-reader: Distillation-free Adaptation Of Vision-language Models For Document Conversion
(2025)
• No Venue
Liu et al.
-
Rstar-coder: Scaling Competitive Code Reasoning With A Large-scale Verified Dataset
(2025)
• No Venue
Liu et al.
-
Scalecua: Scaling Open-source Computer Use Agents With Cross-platform Data
(2025)
• No Venue
Liu et al.
-
Skywork-reward-v2: Scaling Preference Data Curation Via Human-ai Synergy
(2025)
• No Venue
Liu et al.
-
Taking Notes Brings Focus? Towards Multi-turn Multimodal Dialogue Learning
(2025)
• No Venue
Liu et al.
-
Synlogic: Synthesizing Verifiable Reasoning Data At Scale For Learning Logical Reasoning And Beyond
(2025)
• No Venue
Liu et al.
-
Unimoe-audio: Unified Speech And Music Generation With Dynamic-capacity Moe
(2025)
• No Venue
Liu et al.
-
BIOMEDICA: An Open Biomedical Image-caption Archive, Dataset, And Vision-language Models Derived From Scientific Literature
(2025)
• No Venue
Lozano et al.
-
Elv-halluc: Benchmarking Semantic Aggregation Hallucinations In Long Video Understanding
(2025)
• No Venue
Lu et al.
-
Av-reasoner: Improving And Benchmarking Clue-grounded Audio-visual Counting For Mllms
(2025)
• No Venue
Lu et al.
-
Finmme: Benchmark Dataset For Financial Multi-modal Reasoning Evaluation
(2025)
• No Venue
Luo et al.
-
URSA: Understanding And Verifying Chain-of-thought Reasoning In Multimodal Mathematics
(2025)
• No Venue
Luo et al.
-
C3: A Bilingual Benchmark For Spoken Dialogue Models Exploring Challenges In Complex Conversations
(2025)
• No Venue
Chengqian Ma, Wei Tao, Yiwen Guo
-
General-reasoner: Advancing LLM Reasoning Across All Domains
(2025)
• No Venue
Ma et al.
-
Beyondweb: Lessons From Scaling Synthetic Data For Trillion-scale Pretraining
(2025)
• No Venue
Maini et al.
-
Wikivideo: Article Generation From Multiple Videos
(2025)
• No Venue
Martin et al.
-
Hard Negative Mining For Domain-specific Retrieval In Enterprise Systems
(2025)
• No Venue
Meghwani et al.
-
Swe-lancer: Can Frontier Llms Earn $1 Million From Real-world Freelance Software Engineering?
(2025)
• No Venue
Miserendino et al.
-
Synthdetoxm: Modern Llms Are Few-shot Parallel Detoxification Data Annotators
(2025)
• No Venue
Moskovskiy et al.
-
Do Generative Video Models Learn Physical Principles From Watching Videos?
(2025)
• No Venue
Motamed et al.
-
Smoldocling: An Ultra-compact Vision-language Model For End-to-end Multi-modal Document Conversion
(2025)
• No Venue
Nassar et al.
-
Annotation-efficient Universal Honesty Alignment
(2025)
• No Venue
Ni et al.
-
Viscoder2: Building Multi-language Visualization Coding Agents
(2025)
• No Venue
Ni et al.
-
Viscoder: Fine-tuning Llms For Executable Python Visualization Code Generation
(2025)
• No Venue
Ni et al.
-
Does Understanding Inform Generation In Unified Multimodal Models? From Analysis To Path Forward
(2025)
• No Venue
Niu et al.
-
Benchmarking Llms' Swarm Intelligence
(2025)
• No Venue
Ruan et al.
-
Large Language Models Meet Extreme Multi-label Classification: Scaling And Multi-modal Framework
(2025)
• No Venue
Ortego et al.
-
Paper2poster: Towards Multimodal Poster Automation From Scientific Papers
(2025)
• No Venue
Pang et al.
-
Mathfusion: Enhancing Mathematic Problem-solving Of LLM Through Instruction Fusion
(2025)
• No Venue
Pei et al.
-
Fineweb2: One Pipeline To Scale Them All -- Adapting Pre-training Data Processing To Every Language
(2025)
• No Venue
Penedo et al.
-
Multifinben: A Multilingual, Multimodal, And Difficulty-aware Benchmark For Financial LLM Evaluation
(2025)
• No Venue
Peng et al.
-
Plutus: Benchmarking Large Language Models In Low-resource Greek Finance
(2025)
• No Venue
Peng et al.
-
Humanity's Last Exam
(2025)
• No Venue
Phan et al.
-
An Open Recipe: Adapting Language-specific Llms To A Reasoning Model In One Day Via Model Merging
(2025)
• No Venue
Pipatanakul et al.
-
Will It Still Be True Tomorrow? Multilingual Evergreen Question Classification To Improve Trustworthy QA
(2025)
• No Venue
Pletenev et al.
-
THOUGHTTERMINATOR: Benchmarking, Calibrating, And Mitigating Overthinking In Reasoning Models
(2025)
• No Venue
Pu et al.
-
Generating Physically Stable And Buildable LEGO Designs From Text
(2025)
• No Venue
Pun et al.
-
Sofar: Language-grounded Orientation Bridges Spatial Reasoning And Object Manipulation
(2025)
• No Venue
Qi et al.
-
Pico-banana-400k: A Large-scale Dataset For Text-guided Image Editing
(2025)
• No Venue
Qian et al.
-
Fino1: On The Transferability Of Reasoning Enhanced Llms To Finance
(2025)
• No Venue
Qian et al.
-
V-thinker: Interactive Thinking With Images
(2025)
• No Venue
Qiao et al.
-
We-math 2.0: A Versatile Mathbook System For Incentivizing Visual Mathematical Reasoning
(2025)
• No Venue
Qiao et al.
-
Animeshooter: A Multi-shot Animation Dataset For Reference-guided Video Generation
(2025)
• No Venue
Qiu et al.
-
Phybench: Holistic Evaluation Of Physical Perception And Reasoning In Large Language Models
(2025)
• No Venue
Qiu et al.
-
How Well Does Gpt-4o Understand Vision? Evaluating Multimodal Foundation Models On Standard Computer Vision Tasks
(2025)
• No Venue
Ramachandran et al.
-
Videomathqa: Benchmarking Mathematical Reasoning Via Multimodal Understanding In Videos
(2025)
• No Venue
Rasheed et al.
-
Anycap Project: A Unified Framework, Dataset, And Benchmark For Controllable Omni-modal Captioning
(2025)
β’ No Venue
Ren et al.
-
Zerobench: An Impossible Visual Benchmark For Contemporary Large Multimodal Models
(2025)
β’ No Venue
Roberts et al.
-
When Models Lie, We Learn: Multilingual Span-level Hallucination Detection With Psiloqa
(2025)
β’ No Venue
Rykov et al.
-
Dota-rag: Dynamic Of Thought Aggregation RAG
(2025)
β’ No Venue
Ruangtanusak et al.
-
Through The Looking Glass: Common Sense Consistency Evaluation Of Weird Images
(2025)
β’ No Venue
Rykov et al.
-
Aligning Text, Images, And 3D Structure Token-by-token
(2025)
β’ No Venue
Aadarsh Sahoo, Vansh Tibrewal, Georgia Gkioxari
-
Geopolitical Biases In Llms: What Are The "good" And The "bad" Countries According To Contemporary Language Models
(2025)
β’ No Venue
Salnikov et al.
-
ABC: Achieving Better Control Of Multimodal Embeddings Using Vlms
(2025)
β’ No Venue
Benjamin Schneider, Florian Kerschbaum, Wenhu Chen
-
Emonet-voice: A Fine-grained, Expert-verified Benchmark For Speech Emotion Detection
(2025)
β’ No Venue
Schuhmann et al.
-
Seedream 4.0: Toward Next-generation Multimodal Image Generation
(2025)
β’ No Venue
Seedream et al.
-
Reasonir: Training Retrievers For Reasoning Tasks
(2025)
β’ No Venue
Shao et al.
-
Solving Inequality Proofs With Large Language Models
(2025)
β’ No Venue
Sheng et al.
-
Phyx: Does Your Model Have The "wits" For Physical Reasoning?
(2025)
β’ No Venue
Shen et al.
-
Mathcanvas: Intrinsic Visual Chain-of-thought For Multimodal Mathematical Reasoning
(2025)
β’ No Venue
Shi et al.
-
Smolvla: A Vision-language-action Model For Affordable And Efficient Robotics
(2025)
β’ No Venue
Shukor et al.
-
Predictive Data Selection: The Data That Predicts Is The Data That Teaches
(2025)
β’ No Venue
Shum et al.
-
Dinov3
(2025)
β’ No Venue
Siméoni et al.
-
Pushing On Multilingual Reasoning Models With Language-mixed Chain-of-thought
(2025)
β’ No Venue
Son et al.
-
Agent Data Protocol: Unifying Datasets For Diverse, Effective Fine-tuning Of LLM Agents
(2025)
β’ No Venue
Song et al.
-
DMM: Building A Versatile Image Generation Model Via Distillation-based Model Merging
(2025)
β’ No Venue
Song et al.
-
Makeanything: Harnessing Diffusion Transformers For Multi-domain Procedural Sequence Generation
(2025)
β’ No Venue
Yiren Song, Cheng Liu, Mike Zheng Shou
-
Alchemist: Turning Public Text-to-image Data Into Generative Gold
(2025)
β’ No Venue
Startsev et al.
-
Video-lmm Post-training: A Deep Dive Into Video Reasoning With Large Multimodal Models
(2025)
β’ No Venue
Tang et al.
-
Reasonmed: A 370K Multi-agent Generated Dataset For Advancing Medical Reasoning
(2025)
β’ No Venue
Sun et al.
-
Intrex: A Dataset For Modeling Engagement In Educational Conversations
(2025)
β’ No Venue
Tan et al.
-
Large Language Models For Data Synthesis
(2025)
β’ No Venue
Yihong Tang, Menglin Kong, Lijun Sun
-
Lingshu: A Generalist Foundation Model For Unified Multimodal Medical Understanding And Reasoning
(2025)
β’ No Venue
Team et al.
-
Personafeedback: A Large-scale Human-annotated Benchmark For Personalization
(2025)
β’ No Venue
Tao et al.
-
COIG-P: A High-quality And Large-scale Chinese Preference Dataset For Alignment With Human Values
(2025)
β’ No Venue
Team et al.
-
Minicpm4: Ultra-efficient Llms On End Devices
(2025)
β’ No Venue
Team et al.
-
Fixing Data That Hurts Performance: Cascading Llms To Relabel Hard Negatives For Robust Information Retrieval
(2025)
β’ No Venue
Thakur et al.
-
Audiox: Diffusion Transformer For Anything-to-audio Generation
(2025)
β’ No Venue
Tian et al.
-
MMMR: Benchmarking Massive Multi-modal Reasoning Tasks
(2025)
β’ No Venue
Tie et al.
-
Openmathinstruct-1: A 1.8 Million Math Instruction Tuning Dataset
(2024)
β’ No Venue
Toshniwal et al.
-
No "zero-shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance
(2024)
β’ No Venue
Udandarao et al.
-
Replacing Judges With Juries: Evaluating LLM Generations With A Panel Of Diverse Models
(2024)
β’ No Venue
Verga et al.
-
One Missing Piece In Vision And Language: A Survey On Comics Understanding
(2024)
β’ No Venue
Vivoli et al.
-
Meltemi: The First Open Large Language Model For Greek
(2024)
β’ No Venue
Voukoutis et al.
-
Qwen2.5 Technical Report
(2024)
β’ No Venue
Qwen et al.
-
Maya: An Instruction Finetuned Multilingual Multimodal Model
(2024)
β’ No Venue
Alam et al.
-
Understanding Alignment In Multimodal Llms: A Comprehensive Study
(2024)
β’ No Venue
Amirloo et al.
-
Skyeyegpt: Unifying Remote Sensing Vision-language Tasks Via Instruction Tuning With Large Language Model
(2024)
β’ ISPRS Journal of Photogrammetry and Remote Sensing
β’ 51 citations
Yang Zhan, Zhitong Xiong, Yuan Yuan
-
Anygpt: Unified Multimodal LLM With Discrete Sequence Modeling
(2024)
β’ No Venue
Zhan et al.
-
Perplexed By Perplexity: Perplexity-based Data Pruning With Small Reference Models
(2024)
β’ No Venue
Ankner et al.
-
Chronos: Learning The Language Of Time Series
(2024)
β’ No Venue
Ansari et al.
-
Scenescript: Reconstructing Scenes With An Autoregressive Structured Language Model
(2024)
β’ No Venue
Avetisyan et al.
-
MINT-1T: Scaling Open-source Multimodal Data By 10x: A Multimodal Dataset With One Trillion Tokens
(2024)
β’ No Venue
Awadalla et al.
-
BLIP3-KALE: Knowledge Augmented Large-scale Dense Captions
(2024)
β’ No Venue
Awadalla et al.
-
Revisiting In-context Learning With Long Context Language Models
(2024)
β’ No Venue
Baek et al.
-
Screenai: A Vision-language Model For UI And Infographics Understanding
(2024)
β’ No Venue
Baechler et al.
-
Longwriter: Unleashing 10,000+ Word Generation From Long Context Llms
(2024)
β’ No Venue
Bai et al.
-
Fintral: A Family Of GPT-4 Level Multimodal Financial Large Language Models
(2024)
β’ No Venue
Bhatia et al.
-
INDUS: Effective And Efficient Language Models For Scientific Applications
(2024)
β’ No Venue
Bhattacharjee et al.
-
Visual Riddles: A Commonsense And World Knowledge Challenge For Large Vision And Language Models
(2024)
β’ No Venue
Bitton-Guetta et al.
-
Merlin: A Vision Language Foundation Model For 3D Computed Tomography
(2024)
β’ Arxiv
β’ 45 citations
Blankemeier et al.
-
3dgraphllm: Combining Semantic Graphs And Large Language Models For 3D Scene Understanding
(2024)
β’ No Venue
Tatiana Zemskova, Dmitry Yudin
-
Long Code Arena: A Set Of Benchmarks For Long-context Code Models
(2024)
β’ No Venue
Bogomolov et al.
-
Transformers Meet Neural Algorithmic Reasoners
(2024)
β’ No Venue
Bounsi et al.
-
On The Compositional Generalization Of Multimodal Llms For Medical Imaging
(2024)
β’ No Venue
Cai et al.
-
Matryoshka Multimodal Models
(2024)
β’ No Venue
Cai et al.
-
Edgefusion: On-device Text-to-image Generation
(2024)
β’ No Venue
Castells et al.
-
PERSONA: A Reproducible Testbed For Pluralistic Alignment
(2024)
β’ No Venue
Castricato et al.
-
Swe-bench-java: A Github Issue Resolving Benchmark For Java
(2024)
β’ No Venue
Zan et al.
-
Pangea: A Fully Open Multilingual Multimodal LLM For 39 Languages
(2024)
β’ No Venue
Yue et al.
-
Getting It Right: Improving Spatial Consistency In Text-to-image Models
(2024)
β’ No Venue
Chatterjee et al.
-
Tx-llm: A Large Language Model For Therapeutics
(2024)
β’ No Venue
Chaves et al.
-
Premise Order Matters In Reasoning With Large Language Models
(2024)
β’ No Venue
Chen et al.
-
Chexagent: Towards A Foundation Model For Chest X-ray Interpretation
(2024)
β’ No Venue
Chen et al.
-
Compcap: Improving Multimodal Large Language Models With Composite Captions
(2024)
β’ No Venue
Chen et al.
-
Gmai-mmbench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI
(2024)
β’ No Venue
Chen et al.
-
Hallucination Detection: Robustly Discerning Reliable Answers In Large Language Models
(2024)
β’ CIKM '23: The 32nd ACM International Conference on Information and Knowledge Management
β’ 59 citations
Chen et al.
-
How Far Are We To GPT-4V? Closing The Gap To Commercial Multimodal Models With Open-source Suites
(2024)
β’ No Venue
Chen et al.
-
Language Models Are Hidden Reasoners: Unlocking Latent Reasoning Capabilities Via Self-rewarding
(2024)
β’ No Venue
Chen et al.
-
Interleaved Scene Graph For Interleaved Text-and-image Generation Assessment
(2024)
β’ No Venue
Chen et al.
-
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey
(2024)
β’ No Venue
Chen et al.
-
Motionllm: Understanding Human Behaviors From Human Motions And Videos
(2024)
β’ No Venue
Chen et al.
-
MS MARCO Web Search: A Large-scale Information-rich Web Dataset With Millions Of Real Click Labels
(2024)
β’ No Venue
Chen et al.
-
Panda-70m: Captioning 70M Videos With Multiple Cross-modality Teachers
(2024)
β’ No Venue
Chen et al.
-
Reverse Thinking Makes Llms Stronger Reasoners
(2024)
β’ No Venue
Chen et al.
-
Self-play Fine-tuning Converts Weak Language Models To Strong Language Models
(2024)
β’ No Venue
Chen et al.
-
Visionts: Visual Masked Autoencoders Are Free-lunch Zero-shot Time Series Forecasters
(2024)
β’ No Venue
Chen et al.
-
Mmmu-pro: A More Robust Multi-discipline Multimodal Understanding Benchmark
(2024)
β’ No Venue
Yue et al.
-
Unist: A Prompt-empowered Universal Model For Urban Spatio-temporal Prediction
(2024)
β’ Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
β’ 51 citations
Yuan et al.
-
Videorefer Suite: Advancing Spatial-temporal Object Understanding With Video LLM
(2024)
β’ No Venue
Yuan et al.
-
Chronomagic-bench: A Benchmark For Metamorphic Evaluation Of Text-to-time-lapse Video Generation
(2024)
β’ No Venue
Yuan et al.
-
Open-vocabulary SAM: Segment And Recognize Twenty-thousand Classes Interactively
(2024)
β’ No Venue
Yuan et al.
-
Magictime: Time-lapse Video Generation Models As Metamorphic Simulators
(2024)
β’ No Venue
Yuan et al.
-
MMAU: A Holistic Benchmark Of Agent Capabilities Across Diverse Domains
(2024)
β’ No Venue
Yin et al.
-
M3docrag: Multi-modal Retrieval Is What You Need For Multi-page Multi-document Understanding
(2024)
β’ No Venue
Cho et al.
-
M-longdoc: A Benchmark For Multimodal Super-long Document Understanding And A Retrieval-aware Tuning Framework
(2024)
β’ No Venue
Chia et al.
-
A Flexible Large Language Models Guardrail Development Methodology Applied To Off-topic Prompt Detection
(2024)
β’ No Venue
Gabriel Chua, Shing Yee Chan, Shaun Khoo
-
Heavy Labels Out! Dataset Distillation With Label Space Lightening
(2024)
β’ No Venue
Yu et al.
-
Toto: Time Series Optimized Transformer For Observability
(2024)
β’ No Venue
Cohen et al.
-
Saullm-54b & Saullm-141b: Scaling Up Domain Adaptation For The Legal Domain
(2024)
β’ No Venue
Colombo et al.
-
Towards A Personal Health Large Language Model
(2024)
β’ No Venue
Cosentino et al.
-
NVLM: Open Frontier-class Multimodal Llms
(2024)
β’ No Venue
Dai et al.
-
Molmo And Pixmo: Open Weights And Open Data For State-of-the-art Multimodal Models
(2024)
β’ No Venue
Deitke et al.
-
Coconut: Modernizing COCO Segmentation
(2024)
β’ No Venue
Deng et al.
-
Mapeval: A Map-based Evaluation Of Geo-spatial Reasoning In Foundation Models
(2024)
β’ No Venue
Dihan et al.
-
Ferret-ui: Grounded Mobile UI Understanding With Multimodal Llms
(2024)
β’ No Venue
You et al.
-
Unleashing Reasoning Capability Of Llms Via Scalable Question Synthesis From Scratch
(2024)
β’ No Venue
Ding et al.
-
Megapairs: Massive Data Synthesis For Universal Multimodal Retrieval
(2024)
β’ No Venue
Zhou et al.
-
Vintern-1b: An Efficient Multimodal Large Language Model For Vietnamese
(2024)
β’ No Venue
Doan et al.
-
Baichuanseed: Sharing The Potential Of Extensive Data Collection And Deduplication By Introducing A Competitive Large Language Model Baseline
(2024)
β’ No Venue
Dong et al.
-
Charxiv: Charting Gaps In Realistic Chart Understanding In Multimodal Llms
(2024)
β’ No Venue
Wang et al.
-
Git: Towards Generalist Vision Transformer Through Universal Language Interface
(2024)
β’ No Venue
Wang et al.
-
Multilingual E5 Text Embeddings: A Technical Report
(2024)
β’ No Venue
Wang et al.
-
Mtu-bench: A Multi-granularity Tool-use Benchmark For Large Language Models
(2024)
β’ No Venue
Wang et al.
-
Octo: An Open-source Generalist Robot Policy
(2024)
β’ No Venue
Team et al.
-
Mmlu-pro: A More Robust And Challenging Multi-task Language Understanding Benchmark
(2024)
β’ No Venue
Wang et al.
-
Helpsteer2-preference: Complementing Ratings With Preferences
(2024)
β’ No Venue
Wang et al.
-
Grutopia: Dream General Robots In A City At Scale
(2024)
β’ No Venue
Wang et al.
-
Lift: Leveraging Human Feedback For Text-to-video Model Alignment
(2024)
β’ No Venue
Wang et al.
-
How Do Your Code Llms Perform? Empowering Code Instruction Tuning With High-quality Data
(2024)
β’ No Venue
Wang et al.
-
Litesearch: Efficacious Tree Search For LLM
(2024)
β’ No Venue
Wang et al.
-
Structlm: Towards Building Generalist Models For Structured Knowledge Grounding
(2024)
β’ No Venue
Zhuang et al.
-
EVA-CLIP-18B: Scaling CLIP To 18 Billion Parameters
(2024)
β’ No Venue
Sun et al.
-
LAMBDA: A Large Model Based Data Agent
(2024)
β’ No Venue
Sun et al.
-
T2v-compbench: A Comprehensive Benchmark For Compositional Text-to-video Generation
(2024)
β’ No Venue
Sun et al.
-
Parrot: Multilingual Visual Instruction Tuning
(2024)
β’ No Venue
Sun et al.
-
Planetarium: A Rigorous Benchmark For Translating Text To Structured Planning Languages
(2024)
β’ No Venue
Zuo et al.
-
Video-star: Self-training Enables Video Instruction Tuning With Any Supervision
(2024)
β’ No Venue
Zohar et al.
-
Llava-3d: A Simple Yet Effective Pathway To Empowering Lmms With 3d-awareness
(2024)
β’ No Venue
Zhu et al.
-
Yolov9: Learning What You Want To Learn Using Programmable Gradient Information
(2024)
β’ No Venue
Chien-Yao Wang, I-Hau Yeh, Hong-Yuan Mark Liao
-
Diasynth -- Synthetic Dialogue Generation Framework
(2024)
β’ No Venue
Suresh et al.
-
Videogamebunny: Towards Vision Assistants For Video Games
(2024)
β’ No Venue
Mohammad Reza Taesiri, Cor-Paul Bezemer
-
TIP-I2V: A Million-scale Real Text And Image Prompt Dataset For Image-to-video Generation
(2024)
β’ No Venue
Wenhao Wang, Yi Yang
-
Judgebench: A Benchmark For Evaluating Llm-based Judges
(2024)
β’ No Venue
Tan et al.
-
Omnieval: An Omnidirectional And Automatic RAG Evaluation Benchmark In Financial Domain
(2024)
β’ No Venue
Wang et al.
-
PIN: A Knowledge-intensive Dataset For Paired And Interleaved Multimodal Documents
(2024)
β’ No Venue
Wang et al.
-
Textsquare: Scaling Up Text-centric Visual Instruction Tuning
(2024)
β’ No Venue
Tang et al.
-
Ominicontrol: Minimal And Universal Control For Diffusion Transformer
(2024)
β’ No Venue
Tan et al.
-
Ref-avs: Refer And Segment Objects In Audio-visual Scenes
(2024)
β’ No Venue
Wang et al.
-
Grandmaster-level Chess Without Search
(2024)
β’ No Venue
Ruoss et al.
-
Atlas-chat: Adapting Large Language Models For Low-resource Moroccan Arabic Dialect
(2024)
β’ No Venue
Shang et al.
-
MMAU: A Massive Multi-task Audio Understanding And Reasoning Benchmark
(2024)
β’ No Venue
Sakshi et al.
-
Blended RAG: Improving RAG (retriever-augmented Generation) Accuracy With Semantic Search And Hybrid Query-based Retrievers
(2024)
β’ 2024 IEEE 7th International Conference on Multimedia Information Processing and Retrieval (MIPR)
β’ 47 citations
Kunal Sawarkar, Abhilasha Mangal, Shivam Raj Solanki
-
Truth Or Mirage? Towards End-to-end Factuality Evaluation With LLM-OASIS
(2024)
β’ No Venue
Scirè et al.
-
Livexiv -- A Multi-modal Live Benchmark Based On Arxiv Papers Content
(2024)
β’ No Venue
Shabtay et al.
-
Synth^2: Boosting Visual-language Models With Synthetic Captions And Image Embeddings
(2024)
β’ No Venue
Sharifzadeh et al.
-
Jetmoe: Reaching Llama2 Performance With 0.1M Dollars
(2024)
β’ No Venue
Shen et al.
-
Aya Model: An Instruction Finetuned Open-access Multilingual Language Model
(2024)
β’ No Venue
Üstün et al.
-
PERL: Parameter Efficient Reinforcement Learning From Human Feedback
(2024)
β’ No Venue
Sidahmed et al.
-
Can Large Language Models Understand Context?
(2024)
β’ No Venue
Zhu et al.
-
Aya Dataset: An Open-access Collection For Multilingual Instruction Tuning
(2024)
β’ No Venue
Singh et al.
-
MARVEL-40M+: Multi-level Visual Elaboration For High-fidelity Text-to-3d Content Creation
(2024)
β’ No Venue
Sinha et al.
-
Global MMLU: Understanding And Addressing Cultural And Linguistic Biases In Multilingual Evaluation
(2024)
β’ No Venue
Singh et al.
-
A Large Encoder-decoder Family Of Foundation Models For Chemical Language
(2024)
β’ No Venue
Soares et al.
-
The Russian-focused Embedders' Exploration: Rumteb Benchmark And Russian Embedding Model Design
(2024)
β’ No Venue
Snegirev et al.
-
Dolma: An Open Corpus Of Three Trillion Tokens For Language Model Pretraining Research
(2024)
β’ No Venue
Soldaini et al.
-
Both Text And Images Leaked! A Systematic Analysis Of Multimodal LLM Data Contamination
(2024)
β’ No Venue
Song et al.
-
Moviellm: Enhancing Long Video Understanding With Ai-generated Movies
(2024)
β’ No Venue
Song et al.
-
To Cot Or Not To Cot? Chain-of-thought Helps Mainly On Math And Symbolic Reasoning
(2024)
β’ No Venue
Sprague et al.
-
Canttalkaboutthis: Aligning Language Models To Stay On Topic In Dialogues
(2024)
β’ No Venue
Sreedhar et al.
-
Aligning Teacher With Student Preferences For Tailored Training Data Generation
(2024)
β’ No Venue
Liu et al.
-
Best Practices And Lessons Learned On Synthetic Data For Language Models
(2024)
β’ No Venue
Liu et al.
-
Apigen: Automated Pipeline For Generating Verifiable And Diverse Function-calling Datasets
(2024)
β’ No Venue
Liu et al.
-
DDK: Distilling Domain Knowledge For Efficient Large Language Models
(2024)
β’ No Venue
Liu et al.
-
Glyph-byt5-v2: A Strong Aesthetic Baseline For Accurate Multilingual Visual Text Rendering
(2024)
β’ No Venue
Liu et al.
-
Harnessing Webpage Uis For Text-rich Visual Understanding
(2024)
β’ No Venue
Liu et al.
-
Longgenbench: Long-context Generation Benchmark
(2024)
β’ No Venue
Liu et al.
-
MIA-DPO: Multi-image Augmented Direct Preference Optimization For Large Vision-language Models
(2024)
β’ No Venue
Liu et al.
-
POINTS1.5: Building A Vision-language Model Towards Real World Applications
(2024)
β’ No Venue
Liu et al.
-
POINTS: Improving Your Vision-language Model With Affordable Strategies
(2024)
β’ No Venue
Liu et al.
-
Cambrian-1: A Fully Open, Vision-centric Exploration Of Multimodal Llms
(2024)
β’ No Venue
Tong et al.
-
Skywork-reward: Bag Of Tricks For Reward Modeling In Llms
(2024)
β’ No Venue
Liu et al.
-
Teach Multimodal Llms To Comprehend Electrocardiographic Images
(2024)
β’ No Venue
Liu et al.
-
Spatial-temporal Large Language Model For Traffic Prediction
(2024)
β’ 2024 25th IEEE International Conference on Mobile Data Management (MDM)
β’ 56 citations
Liu et al.
-
World Model On Million-length Video And Language With Ringattention
(2024)
β’ No Venue
Liu et al.
-
RULE: Reliable Multimodal RAG For Factuality In Medical Vision Language Models
(2024)
β’ No Venue
Xia et al.
-
Video Instruction Tuning With Synthetic Data
(2024)
β’ No Venue
Zhang et al.
-
MAVIS: Mathematical Visual Instruction Tuning
(2024)
β’ No Venue
Zhang et al.
-
Mme-realworld: Could Your Multimodal LLM Challenge High-resolution Real-world Scenarios That Are Difficult For Humans?
(2024)
β’ No Venue
Zhang et al.
-
Agentgym: Evolving Large Language Model-based Agents Across Diverse Environments
(2024)
β’ No Venue
Xi et al.
-
Multimodal Self-instruct: Synthetic Abstract Image And Visual Reasoning Instruction Using Language Model
(2024)
β’ No Venue
Zhang et al.
-
SPAR: Personalized Content-based Recommendation Via Long Engagement Attention
(2024)
β’ No Venue
Zhang et al.
-
Seacrowd: A Multilingual Multimodal Data Hub And Benchmark Suite For Southeast Asian Languages
(2024)
β’ No Venue
Lovenia et al.
-
Starcoder 2 And The Stack V2: The Next Generation
(2024)
β’ No Venue
Lozhkov et al.
-
Large Language Models Are Superpositions Of All Characters: Attaining Arbitrary Role-play Via Self-alignment
(2024)
β’ No Venue
Lu et al.
-
Generative World Explorer
(2024)
β’ No Venue
Lu et al.
-
Mathverse: Does Your Multi-modal LLM Truly See The Diagrams In Visual Math Problems?
(2024)
β’ No Venue
Zhang et al.
-
Omniparser For Pure Vision Based GUI Agent
(2024)
β’ No Venue
Lu et al.
-
Mathcoder2: Better Math Reasoning From Continued Pretraining On Model-translated Mathematical Code
(2024)
β’ No Venue
Lu et al.
-
Robustft: Robust Supervised Fine-tuning For Large Language Models Under Noisy Response
(2024)
β’ No Venue
Luo et al.
-
Mmevol: Empowering Multimodal Large Language Models With Evol-instruct
(2024)
β’ No Venue
Luo et al.
-
Weblinx: Real-world Website Navigation With Multi-turn Dialogue
(2024)
β’ No Venue
Xing Han Lù, Zdeněk Kasner, Siva Reddy
-
Reft: Reasoning With Reinforced Fine-tuning
(2024)
β’ No Venue
Luong et al.
-
Aria Everyday Activities Dataset
(2024)
β’ No Venue
Lv et al.
-
Diffsensei: Bridging Multi-modal Llms And Diffusion Models For Customized Manga Generation
(2024)
β’ No Venue
Wu et al.
-
Plot2code: A Comprehensive Benchmark For Evaluating Multi-modal Large Language Models In Code Generation From Scientific Plots
(2024)
β’ No Venue
Wu et al.
-
Foundation Models For Music: A Survey
(2024)
β’ No Venue
Ma et al.
-
Futga: Towards Fine-grained Music Understanding Through Temporally-enhanced Generative Augmentation
(2024)
β’ No Venue
Wu et al.
-
Fiva: Fine-grained Visual Attribute Dataset For Text-to-image Diffusion Models
(2024)
β’ No Venue
Wu et al.
-
Wildchat: 1M Chatgpt Interaction Logs In The Wild
(2024)
β’ No Venue
Zhao et al.
-
Eurollm: Multilingual Language Models For Europe
(2024)
β’ No Venue
Martins et al.
-
Improving Text-to-image Consistency Via Automatic Prompt Optimization
(2024)
β’ No Venue
Mañas et al.
-
Openelm: An Efficient Language Model Family With Open-source Training And Inference Framework
(2024)
β’ No Venue
Mehta et al.
-
Videoglamm: A Large Multimodal Model For Pixel-level Visual Grounding In Videos
(2024)
β’ No Venue
Munasinghe et al.
-
A Pointer Network-based Approach For Joint Extraction And Detection Of Multi-label Multi-class Intents
(2024)
β’ No Venue
Mullick et al.
-
Yesbut: A High-quality Annotated Multimodal Dataset For Evaluating Satire Comprehension Capability Of Vision-language Models
(2024)
β’ No Venue
Nandy et al.
-
Openvid-1m: A Large-scale High-quality Dataset For Text-to-video Generation
(2024)
β’ No Venue
Nan et al.
-
Preference Tuning With Human Feedback On Language, Speech, And Vision Tasks: A Survey
(2024)
β’ No Venue
Winata et al.
-
A Survey Of Small Language Models
(2024)
β’ No Venue
Nguyen et al.
-
User-llm: Efficient LLM Contextualization With User Embeddings
(2024)
β’ No Venue
Ning et al.
-
Xland-100b: A Large-scale Multi-task Dataset For In-context Reinforcement Learning
(2024)
β’ No Venue
Nikulin et al.
-
Llms Know More Than They Show: On The Intrinsic Representation Of LLM Hallucinations
(2024)
β’ No Venue
Orgad et al.
-
Omnidocbench: Benchmarking Diverse PDF Document Parsing With Comprehensive Annotations
(2024)
β’ No Venue
Ouyang et al.
-
Worldcuisines: A Massive-scale Benchmark For Multilingual And Multicultural Visual Question Answering On Global Cuisines
(2024)
β’ No Venue
Winata et al.
-
Training Software Engineering Agents And Verifiers With Swe-gym
(2024)
β’ No Venue
Pan et al.
-
Llmlingua-2: Data Distillation For Efficient And Faithful Task-agnostic Prompt Compression
(2024)
β’ No Venue
Pan et al.
-
IOPO: Empowering Llms With Complex Instruction Following Via Input-output Preference Optimization
(2024)
β’ No Venue
Zhang et al.
-
Datadreamer: A Tool For Synthetic Data Generation And Reproducible LLM Workflows
(2024)
β’ No Venue
Ajay Patel, Colin Raffel, Chris Callison-Burch
-
Survey Of Cultural Awareness In Language Models: Text And Beyond
(2024)
β’ No Venue
Pawar et al.
-
Dreambench++: A Human-aligned Benchmark For Personalized Image Generation
(2024)
β’ No Venue
Peng et al.
-
Large Language Model Confidence Estimation Via Black-box Access
(2024)
β’ No Venue
Pedapati et al.
-
The Fineweb Datasets: Decanting The Web For The Finest Text Data At Scale
(2024)
β’ No Venue
Penedo et al.
-
Livebench: A Challenging, Contamination-free LLM Benchmark
(2024)
β’ No Venue
White et al.
-
A Toolbox For Surfacing Health Equity Harms And Biases In Large Language Models
(2024)
β’ Nature Medicine
β’ 46 citations
Pfohl et al.
-
We-math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?
(2024)
β’ No Venue
Qiao et al.
-
Evaluating D-MERIT Of Partial-annotation On Information Retrieval
(2024)
β’ No Venue
Rassin et al.
-
Adapting Safe-for-work Classifier For Malaysian Language Text: Enhancing Alignment In Llm-ops Framework
(2024)
β’ No Venue
Razak et al.
-
Redpajama: An Open Dataset For Training Large Language Models
(2024)
β’ No Venue
Weber et al.
-
VISTA: Enhancing Long-duration And High-resolution Video Understanding By Video Spatiotemporal Augmentation
(2024)
β’ No Venue
Ren et al.
-
Omniedit: Building Image Editing Generalist Models Through Specialist Supervision
(2024)
β’ No Venue
Wei et al.
-
Urbench: A Comprehensive Benchmark For Evaluating Large Multimodal Models In Multi-view Urban Scenarios
(2024)
β’ No Venue
Zhou et al.
-
Paint By Inpaint: Learning To Add Image Objects By Removing Them First
(2024)
β’ No Venue
Wasserman et al.
-
RLHF Workflow: From Reward Modeling To Online RLHF
(2024)
β’ No Venue
Dong et al.
-
Toward General Instruction-following Alignment For Retrieval-augmented Generation
(2024)
β’ No Venue
Dong et al.
-
CLEAR: Character Unlearning In Textual And Visual Modalities
(2024)
β’ No Venue
Dontsov et al.
-
Hyperclova X Technical Report
(2024)
β’ No Venue
Yoo et al.
-
An Interactive Agent Foundation Model
(2024)
β’ No Venue
Durante et al.
-
Learning To Move Like Professional Counter-strike Players
(2024)
β’ No Venue
Durst et al.
-
Processbench: Identifying Process Errors In Mathematical Reasoning
(2024)
β’ No Venue
Zheng et al.
-
Chemllm: A Chemical Large Language Model
(2024)
β’ No Venue
Zhang et al.
-
CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark
(2024)
β’ No Venue
Zhang et al.
-
Croissantllm: A Truly Bilingual French-english Language Model
(2024)
β’ No Venue
Faysse et al.
-
Test Of Time: A Benchmark For Evaluating Llms On Temporal Reasoning
(2024)
β’ No Venue
Fatemi et al.
-
Enhancing Video-language Representations With Structural Spatio-temporal Alignment
(2024)
β’ IEEE Transactions on Pattern Analysis and Machine Intelligence
β’ 49 citations
Fei et al.
-
Openfedllm: Training Large Language Models On Decentralized Private Data Via Federated Learning
(2024)
β’ KDD '24: The 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
β’ 47 citations
Ye et al.
-
RAG Foundry: A Framework For Enhancing Llms For Retrieval Augmented Generation
(2024)
β’ No Venue
Fleischer et al.
-
Video-mme: The First-ever Comprehensive Evaluation Benchmark Of Multi-modal Llms In Video Analysis
(2024)
β’ No Venue
Fu et al.
-
LOKI: A Comprehensive Synthetic Data Detection Benchmark Using Large Multimodal Models
(2024)
β’ No Venue
Ye et al.
-
Mm-ego: Towards Building Egocentric Multimodal Llms
(2024)
β’ No Venue
Ye et al.
-
Omni-math: A Universal Olympiad Level Mathematic Benchmark For Large Language Models
(2024)
β’ No Venue
Gao et al.
-
Dreamreward: Text-to-3d Generation With Human Preference
(2024)
β’ No Venue
Ye et al.
-
Longins: A Challenging Long-context Instruction-based Exam For Llms
(2024)
β’ No Venue
Gavin et al.
-
Kvasir-vqa: A Text-image Pair GI Tract Dataset
(2024)
β’ No Venue
Gautam et al.
-
Are We Done With MMLU?
(2024)
β’ No Venue
Gema et al.
-
Socially Aware Synthetic Data Generation For Suicidal Ideation Detection Using Large Language Models
(2024)
β’ IEEE Access
β’ 40 citations
Hamideh Ghanadian, Isar Nejadgholi, Hussein Al Osman
-
Learn Your Reference Model For Real Good Alignment
(2024)
β’ No Venue
Gorbatovski et al.
-
Zamba: A Compact 7B SSM Hybrid Model
(2024)
β’ No Venue
Glorioso et al.
-
Mulberry: Empowering MLLM With O1-like Reasoning And Reflection Via Collective Monte Carlo Tree Search
(2024)
β’ No Venue
Yao et al.
-
Atomovideo: High Fidelity Image-to-video Generation
(2024)
β’ No Venue
Gong et al.
-
Av-odyssey Bench: Can Your Multimodal Llms Really Understand Audio-visual Information?
(2024)
β’ No Venue
Gong et al.
-
Navigating The Digital World As Humans Do: Universal Visual Grounding For GUI Agents
(2024)
β’ No Venue
Gou et al.
-
Olmo: Accelerating The Science Of Language Models
(2024)
β’ No Venue
Groeneveld et al.
-
Sam2point: Segment Any 3D As Videos In Zero-shot And Promptable Manners
(2024)
β’ No Venue
Guo et al.
-
Mammoth-vl: Eliciting Multimodal Reasoning With Instruction Tuning At Scale
(2024)
β’ No Venue
Guo et al.
-
Direct Language Model Alignment From Online AI Feedback
(2024)
β’ No Venue
Guo et al.
-
Infimm-webmath-40b: Advancing Multimodal Pre-training For Enhanced Mathematical Reasoning
(2024)
β’ No Venue
Han et al.
-
Vision-language Models For Medical Report Generation And Visual Question Answering: A Review
(2024)
β’ Frontiers in Artificial Intelligence
β’ 86 citations
Iryna Hartsock, Ghulam Rasool
-
Data Mixture Inference: What Do BPE Tokenizers Reveal About Their Training Data?
(2024)
β’ No Venue
Hayase et al.
-
Distill Visual Chart Reasoning Ability From Llms To Mllms
(2024)
β’ No Venue
He et al.
-
Cameractrl: Enabling Camera Control For Text-to-video Generation
(2024)
β’ No Venue
He et al.
-
Mmworld: Towards Multi-discipline Multi-faceted World Model Evaluation In Videos
(2024)
β’ No Venue
He et al.
-
UCFE: A User-centric Financial Expertise Benchmark For Large Language Models
(2024)
β’ No Venue
Yang et al.
-
Vript: A Video Is Worth Thousands Of Words
(2024)
β’ No Venue
Yang et al.
-
Thinking In Space: How Multimodal Large Language Models See, Remember, And Recall Spaces
(2024)
β’ No Venue
Yang et al.
-
CRAG -- Comprehensive RAG Benchmark
(2024)
β’ No Venue
Yang et al.
-
Sampart3d: Segment Any Part In 3D Objects
(2024)
β’ No Venue
Yang et al.
-
3D-GRAND: A Million-scale Dataset For 3d-llms With Better Grounding And Less Hallucination
(2024)
β’ No Venue
Yang et al.
-
Mplug-docowl 1.5: Unified Structure Learning For Ocr-free Document Understanding
(2024)
β’ No Venue
Hu et al.
-
Compression Represents Intelligence Linearly
(2024)
β’ No Venue
Huang et al.
-
Can Knowledge Editing Really Correct Hallucinations?
(2024)
β’ No Venue
Huang et al.
-
How Good Are Low-bit Quantized Llama3 Models? An Empirical Study
(2024)
β’ No Venue
Huang et al.
-
RU-AI: A Large Multimodal Dataset For Machine-generated Content Detection
(2024)
β’ Arxiv
β’ 2046 citations
Huang et al.
-
Simple And Scalable Strategies To Continually Pre-train Large Language Models
(2024)
β’ No Venue
Ibrahim et al.
-
Gitchameleon: Unmasking The Version-switching Capabilities Of Code Generation Models
(2024)
β’ No Venue
Islah et al.
-
Improving Medical Reasoning Through Retrieval And Self-reflection With Retrieval-augmented Large Language Models
(2024)
β’ Bioinformatics
β’ 50 citations
Jeong et al.
-
LEOPARD: A Vision Language Model For Text-rich Multi-image Tasks
(2024)
β’ No Venue
Jia et al.
-
Many-shot In-context Learning In Multimodal Foundation Models
(2024)
β’ No Venue
Jiang et al.
-
SOLAMI: Social Vision-language-action Modeling For Immersive Interaction With 3D Autonomous Characters
(2024)
β’ No Venue
Jiang et al.
-
RATIONALYST: Pre-training Process-supervision For Improving Reasoning
(2024)
β’ No Venue
Jiang et al.
-
Dsbench: How Far Are Data Science Agents To Becoming Data Science Experts?
(2024)
β’ No Venue
Jing et al.
-
Accessing GPT-4 Level Mathematical Olympiad Solutions Via Monte Carlo Tree Self-refine With Llama-3 8B
(2024)
β’ No Venue
Zhang et al.
-
VARCO-VISION: Expanding Frontiers In Korean Vision-language Models
(2024)
β’ No Venue
Ju et al.
-
Omniact: A Dataset And Benchmark For Enabling Multimodal Generalist Autonomous Agents For Desktop And Web
(2024)
β’ No Venue
Kapoor et al.
-
Vineppo: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment
(2024)
β’ No Venue
Kazemnejad et al.
-
ATHAR: A High-quality And Diverse Dataset For Classical Arabic To English Translation
(2024)
β’ No Venue
Mohammed Khalil, Mohammed Sabry
-
Sdpo: Don't Use Your Data All At Once
(2024)
β’ No Venue
Kim et al.
-
Evaluating Language Models As Synthetic Data Generators
(2024)
β’ No Venue
Kim et al.
-
Husky: A Unified, Open-source Language Agent For Multi-step Reasoning
(2024)
β’ No Venue
Kim et al.
-
Xgen-mm (BLIP-3): A Family Of Open Large Multimodal Models
(2024)
β’ No Venue
Xue et al.
-
Longvila: Scaling Long-context Visual Language Models For Long Videos
(2024)
β’ No Venue
Xue et al.
-
Fact, Fetch, And Reason: A Unified Evaluation Of Retrieval-augmented Generation
(2024)
β’ No Venue
Krishna et al.
-
Harvesting Textual And Structured Data From The HAL Publication Repository
(2024)
β’ No Venue
Kulumba et al.
-
Biomistral: A Collection Of Open-source Pretrained Large Language Models For Medical Domains
(2024)
β’ Findings of the Association for Computational Linguistics ACL 2024
β’ 108 citations
Labrak et al.
-
TÜLU 3: Pushing Frontiers In Open Language Model Post-training
(2024)
β’ No Venue
Lambert et al.
-
Rewardbench: Evaluating Reward Models For Language Modeling
(2024)
β’ No Venue
Lambert et al.
-
Building And Better Understanding Vision-language Models: Insights And Future Directions
(2024)
β’ No Venue
Laurençon et al.
-
Unlocking The Conversion Of Web Screenshots Into HTML Code With The Websight Dataset
(2024)
β’ No Venue
Hugo Laurençon, Léo Tronchon, Victor Sanh
-
What Matters When Building Vision-language Models?
(2024)
β’ No Venue
Laurençon et al.
-
Closing The Gap Between Open-source And Commercial Large Language Models For Medical Evidence Summarization
(2024)
β’ npj Digital Medicine
β’ 45 citations
Zhang et al.
-
Thanos: Enhancing Conversational Agents With Skill-of-mind-infused Large Language Model
(2024)
β’ No Venue
Lee et al.
-
Meteor: Mamba-based Traversal Of Rationale For Large Language And Vision Models
(2024)
β’ No Venue
Lee et al.
-
Stark: Social Long-term Multi-modal Conversation With Persona Commonsense Knowledge
(2024)
β’ No Venue
Lee et al.
-
A Careful Examination Of Large Language Model Performance On Grade School Arithmetic
(2024)
β’ No Venue
Zhang et al.
-
Ootdiffusion: Outfitting Fusion Based Latent Diffusion For Controllable Virtual Try-on
(2024)
β’ No Venue
Xu et al.
-
Stronger Models Are NOT Stronger Teachers For Instruction Tuning
(2024)
β’ No Venue
Xu et al.
-
Slowfast-llava: A Strong Training-free Baseline For Video Large Language Models
(2024)
β’ No Venue
Xu et al.
-
Long-context Llms Struggle With Long In-context Learning
(2024)
β’ No Venue
Li et al.
-
Llava-next-interleave: Tackling Multi-image, Video, And 3D In Large Multimodal Models
(2024)
β’ No Venue
Li et al.
-
LAION-SG: An Enhanced Large-scale Dataset For Training Complex Image-text Models With Structural Annotations
(2024)
β’ No Venue
Li et al.
-
Direct Preference Knowledge Distillation For Large Language Models
(2024)
β’ No Venue
Li et al.
-
Datacomp-lm: In Search Of The Next Generation Of Training Sets For Language Models
(2024)
β’ No Venue
Li et al.
-
Codes: Towards Building Open-source Language Models For Text-to-sql
(2024)
β’ Proceedings of the ACM on Management of Data
β’ 44 citations
Li et al.
-
Dotamath: Decomposition Of Thought With Code Assistance And Self-correction For Mathematical Reasoning
(2024)
β’ No Venue
Li et al.
-
GMAI-VL & GMAI-VL-5.5M: A Large Vision-language Model And A Comprehensive Multimodal Dataset Towards General Medical AI
(2024)
β’ No Venue
Li et al.
-
Androidlab: Training And Systematic Benchmarking Of Android Autonomous Agents
(2024)
β’ No Venue
Xu et al.
-
Omnicorpus: A Unified Multimodal Corpus Of 10 Billion-level Images Interleaved With Text
(2024)
β’ No Venue
Li et al.
-
Omnibench: Towards The Future Of Universal Omni-language Models
(2024)
β’ No Venue
Li et al.
-
Scaling (down) CLIP: A Comprehensive Analysis Of Data, Architecture, And Training Strategies
(2024)
β’ No Venue
Zichao Li, Cihang Xie, Ekin Dogus Cubuk
-
Synthetic Data (almost) From Scratch: Generalized Instruction Tuning For Language Models
(2024)
β’ No Venue
Li et al.
-
Urbangpt: Spatio-temporal Large Language Models
(2024)
β’ KDD '24: The 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
β’ 69 citations
Li et al.
-
Wolf: Captioning Everything With A World Summarization Framework
(2024)
β’ No Venue
Li et al.
-
Chatglm-math: Improving Math Problem-solving In Large Language Models With A Self-critique Pipeline
(2024)
β’ No Venue
Xu et al.
-
Magpie: Alignment Data Synthesis From Scratch By Prompting Aligned Llms With Nothing
(2024)
β’ No Venue
Xu et al.
-
Contrastive Preference Optimization: Pushing The Boundaries Of LLM Performance In Machine Translation
(2024)
β’ No Venue
Xu et al.
-
Earthgpt: A Universal Multi-modal Large Language Model For Multi-sensor Image Comprehension In Remote Sensing Domain
(2024)
β’ IEEE Transactions on Geoscience and Remote Sensing
β’ 78 citations
Zhang et al.
-
MMIE: Massive Multimodal Interleaved Comprehension Benchmark For Large Vision-language Models
(2024)
β’ No Venue
Xia et al.
-
Document Parsing Unveiled: Techniques, Challenges, And Prospects For Structured Information Extraction
(2024)
β’ No Venue
Zhang et al.
-
Benchmarking Retrieval-augmented Generation For Medicine
(2024)
β’ Findings of the Association for Computational Linguistics ACL 2024
β’ 119 citations
Xiong et al.
-
I-SHEEP: Self-alignment Of LLM From Scratch Through An Iterative Self-enhancement Paradigm
(2024)
β’ No Venue
Liang et al.
-
Large Motion Video Autoencoding With Cross-modal Video VAE
(2024)
β’ No Venue
Xing et al.
-
HARE: Human Priors, A Key To Small Language Model Efficiency
(2024)
β’ No Venue
Zhang et al.
-
Showui: One Vision-language-action Model For GUI Visual Agent
(2024)
β’ No Venue
Lin et al.
-
Medtrinity-25m: A Large-scale Multimodal Dataset With Multigranular Annotations For Medicine
(2024)
β’ No Venue
Xie et al.
-
A Preliminary Study Of O1 In Medicine: Are We Closer To An AI Doctor?
(2024)
β’ No Venue
Xie et al.
-
Open-finllms: Open Multimodal Large Language Models For Financial Applications
(2024)
β’ No Venue
Xie et al.
-
The Finben: An Holistic Financial Benchmark For Large Language Models
(2024)
β’ No Venue
Xie et al.
-
Fine-tuning Or Retrieval? Comparing Knowledge Injection In Llms
(2023)
β’ Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
β’ 50 citations
Ovadia et al.
-
Chartgpt: Leveraging Llms To Generate Charts From Abstract Natural Language
(2023)
β’ IEEE Transactions on Visualization and Computer Graphics
β’ 48 citations
Tian et al.
-
The Refinedweb Dataset For Falcon LLM: Outperforming Curated Corpora With Web Data, And Web Data Only
(2023)
β’ No Venue
Penedo et al.
-
Evaluating The Logical Reasoning Ability Of Chatgpt And GPT-4
(2023)
β’ Arxiv
β’ 102 citations
Liu et al.
-
Agentbench: Evaluating Llms As Agents
(2023)
β’ No Venue
Liu et al.
-
A Comprehensive Evaluation Of Chatgpt's Zero-shot Text-to-sql Capability
(2023)
β’ Arxiv
β’ 58 citations
Liu et al.
-
Revisiting Temporal Modeling For Clip-based Image-to-video Knowledge Transferring
(2023)
β’ 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 44 citations
Liu et al.
-
Multi-task Recommendations With Reinforcement Learning
(2023)
β’ IEEE Transactions on Image Processing
β’ 53 citations
Liu et al.
-
Tinygsm: Achieving >80% On Gsm8k With Small Language Models
(2023)
β’ No Venue
Liu et al.
-
CLIP As RNN: Segment Countless Visual Concepts Without Training Endeavor
(2023)
β’ No Venue
Sun et al.
-
C-pack: Packed Resources For General Chinese Embeddings
(2023)
β’ Arxiv
β’ 69 citations
Xiao et al.
-
Inconsistent Matters: A Knowledge-guided Dual-consistency Network For Multi-modal Rumor Detection
(2023)
β’ IEEE Transactions on Knowledge and Data Engineering
β’ 47 citations
Sun et al.
-
The Flan Collection: Designing Data And Methods For Effective Instruction Tuning
(2023)
β’ Arxiv
β’ 109 citations
Longpre et al.
-
Interpretable Long-form Legal Question Answering With Retrieval-augmented Large Language Models
(2023)
β’ Proceedings of the AAAI Conference on Artificial Intelligence
β’ 45 citations
Antoine Louis, Gijs van Dijck, Gerasimos Spanakis
-
Visual Language Pretrained Multiple Instance Zero-shot Transfer For Histopathology Images
(2023)
β’ 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 79 citations
Lu et al.
-
Unified-io 2: Scaling Autoregressive Multimodal Models With Vision, Language, Audio, And Action
(2023)
β’ No Venue
Lu et al.
-
Llama-reviewer: Advancing Code Review Automation With Large Language Models Through Parameter-efficient Fine-tuning
(2023)
β’ 2023 IEEE 34th International Symposium on Software Reliability Engineering (ISSRE)
β’ 71 citations
Lu et al.
-
Level Generation Through Large Language Models
(2023)
β’ FDG 2023: Foundations of Digital Games 2023
β’ 65 citations
Todd et al.
-
Can Chatgpt Reproduce Human-generated Labels? A Study Of Social Computing Tasks
(2023)
β’ Arxiv
β’ 62 citations
Zhu et al.
-
Taiyi: A Bilingual Fine-tuned Large Language Model For Diverse Biomedical Tasks
(2023)
β’ Journal of the American Medical Informatics Association
β’ 41 citations
Luo et al.
-
Llms For Knowledge Graph Construction And Reasoning: Recent Capabilities And Future Opportunities
(2023)
β’ World Wide Web
β’ 130 citations
Zhu et al.
-
CORAL: Expert-curated Medical Oncology Reports To Advance Language Model Inference
(2023)
β’ NEJM AI
β’ 42 citations
Sushil et al.
-
Fingpt: Large Generative Models For A Small Language
(2023)
β’ No Venue
Luukkonen et al.
-
A Transformer-based Model With Self-distillation For Multimodal Emotion Recognition In Conversations
(2023)
β’ IEEE Transactions on Multimedia
β’ 71 citations
Ma et al.
-
Auto-avsr: Audio-visual Speech Recognition With Automatic Labels
(2023)
β’ ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
β’ 91 citations
Ma et al.
-
Text-to-sticker: Style Tailoring Latent Diffusion Models For Human Expression
(2023)
β’ No Venue
Sinha et al.
-
Tidybot: Personalized Robot Assistance With Large Language Models
(2023)
β’ 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
β’ 65 citations
Wu et al.
-
3d-vista: Pre-trained Transformer For 3D Vision And Text Alignment
(2023)
β’ 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
β’ 82 citations
Zhu et al.
-
Can Generalist Foundation Models Outcompete Special-purpose Tuning? Case Study In Medicine
(2023)
β’ Arxiv
β’ 157 citations
Nori et al.
-
Capabilities Of GPT-4 On Medical Challenge Problems
(2023)
β’ Arxiv
β’ 474 citations
Nori et al.
-
A Comprehensive Overview Of Large Language Models
(2023)
β’ ACM Transactions on Intelligent Systems and Technology
β’ 152 citations
Naveed et al.
-
Culturax: A Cleaned, Enormous, And Multilingual Dataset For Large Language Models In 167 Languages
(2023)
β’ No Venue
Nguyen et al.
-
Hyenadna: Long-range Genomic Sequence Modeling At Single Nucleotide Resolution
(2023)
β’ Arxiv
β’ 140 citations
Nguyen et al.
-
Chatgpt Or Grammarly? Evaluating Chatgpt On Grammatical Error Correction Benchmark
(2023)
β’ Arxiv
β’ 48 citations
Wu et al.
-
Llasm: Large Language And Speech Model
(2023)
β’ No Venue
Shu et al.
-
Towards Geospatial Foundation Models Via Continual Pretraining
(2023)
β’ 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
β’ 57 citations
Mendieta et al.
-
Text2kgbench: A Benchmark For Ontology-driven Knowledge Graph Generation From Text
(2023)
β’ Lecture Notes in Computer Science
β’ 51 citations
Mihindukulasooriya et al.
-
Embodiedgpt: Vision-language Pre-training Via Embodied Chain Of Thought
(2023)
β’ Arxiv
β’ 41 citations
Mu et al.
-
Pmc-llama: Towards Building Open-source Language Models For Medicine
(2023)
β’ Journal of the American Medical Informatics Association
β’ 179 citations
Wu et al.
-
Video-chatgpt: Towards Detailed Video Understanding Via Large Vision And Language Models
(2023)
β’ Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
β’ 224 citations
Maaz et al.
-
Q-instruct: Improving Low-level Visual Abilities For Multi-modality Foundation Models
(2023)
β’ No Venue
Wu et al.
-
Enhancing CLIP With GPT-4: Harnessing Visual Descriptions As Prompts
(2023)
β’ 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)
β’ 45 citations
Maniparambil et al.
-
Mathcoder: Seamless Code Integration In Llms For Enhanced Mathematical Reasoning
(2023)
β’ No Venue
Wang et al.
-
Tinystories: How Small Can Language Models Be And Still Speak Coherent English?
(2023)
β’ No Venue
Ronen Eldan, Yuanzhi Li
-
From Sparse To Dense: GPT-4 Summarization With Chain Of Density Prompting
(2023)
β’ No Venue
Adams et al.
-
Is Chatgpt A Good NLG Evaluator? A Preliminary Study
(2023)
β’ Proceedings of the 4th New Frontiers in Summarization Workshop
β’ 178 citations
Wang et al.
-
MEGA: Multilingual Evaluation Of Generative AI
(2023)
β’ Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
β’ 76 citations
Ahuja et al.
-
Instructuie: Multi-task Instruction Tuning For Unified Information Extraction
(2023)
β’ Arxiv
β’ 46 citations
Wang et al.
-
Docllm: A Layout-aware Generative Language Model For Multimodal Document Understanding
(2023)
β’ No Venue
Wang et al.
-
Improving Text Embeddings With Large Language Models
(2023)
β’ No Venue
Wang et al.
-
Cross-modal Contrastive Learning For Multimodal Fake News Detection
(2023)
β’ Proceedings of the 31st ACM International Conference on Multimedia
β’ 58 citations
Wang et al.
-
Large Language Models Streamline Automated Machine Learning For Clinical Studies
(2023)
β’ Nature Communications
β’ 74 citations
Arasteh et al.
-
Openflamingo: An Open-source Framework For Training Large Autoregressive Vision-language Models
(2023)
β’ No Venue
Awadalla et al.
-
Rasa: Relation And Sensitivity Aware Representation Learning For Text-based Person Search
(2023)
β’ Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence
β’ 73 citations
Bai et al.
-
Longbench: A Bilingual, Multitask Benchmark For Long Context Understanding
(2023)
β’ Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
β’ 48 citations
Bai et al.
-
Learning To Exploit Temporal Structure For Biomedical Vision-language Processing
(2023)
β’ 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 102 citations
Bannur et al.
-
Codekgc: Code Language Model For Generative Knowledge Graph Construction
(2023)
β’ ACM Transactions on Asian and Low-Resource Language Information Processing
β’ 40 citations
Bi et al.
-
Nougat: Neural Optical Understanding For Academic Documents
(2023)
β’ No Venue
Blecher et al.
-
Spanish Pre-trained BERT Model And Evaluation Data
(2023)
β’ Arxiv
β’ 332 citations
Cañete et al.
-
Multilora: Democratizing Lora For Better Multi-task Learning
(2023)
β’ No Venue
Wang et al.
-
On The Possibilities Of Ai-generated Text Detection
(2023)
β’ Arxiv
β’ 50 citations
Chakraborty et al.
-
Alpagasus: Training A Better Alpaca With Fewer Data
(2023)
β’ No Venue
Chen et al.
-
Driving With Llms: Fusing Object-level Vector Modality For Explainable Autonomous Driving
(2023)
β’ 2024 IEEE International Conference on Robotics and Automation (ICRA)
β’ 110 citations
Chen et al.
-
Clip2scene: Towards Label-efficient 3D Scene Understanding By CLIP
(2023)
β’ 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 105 citations
Chen et al.
-
Diversevul: A New Vulnerable Source Code Dataset For Deep Learning Based Vulnerability Detection
(2023)
β’ Proceedings of the 26th International Symposium on Research in Attacks, Intrusions and Defenses
β’ 136 citations
Chen et al.
-
Hdformer: High-order Directed Transformer For 3D Human Pose Estimation
(2023)
β’ Thirty-Second International Joint Conference on Artificial Intelligence (IJCAI-23)
β’ 44 citations
Chen et al.
-
Modelscope Text-to-video Technical Report
(2023)
β’ Arxiv
β’ 46 citations
Wang et al.
-
Plan-and-solve Prompting: Improving Zero-shot Chain-of-thought Reasoning By Large Language Models
(2023)
β’ Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
β’ 149 citations
Wang et al.
-
Internvid: A Large-scale Video-text Dataset For Multimodal Understanding And Generation
(2023)
β’ No Venue
Wang et al.
-
CVT-SLR: Contrastive Visual-textual Transformation For Sign Language Recognition With Variational Alignment
(2023)
β’ 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 72 citations
Zheng et al.
-
GPT-RE: In-context Learning For Relation Extraction Using Large Language Models
(2023)
β’ Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
β’ 110 citations
Wan et al.
-
Selformer: Molecular Representation Learning Via SELFIES Language Models
(2023)
β’ Machine Learning: Science and Technology
β’ 43 citations
Yüksel et al.
-
Adapointr: Diverse Point Cloud Completion With Adaptive Geometry-aware Transformers
(2023)
β’ IEEE Transactions on Pattern Analysis and Machine Intelligence
β’ 79 citations
Yu et al.
-
Scaling Relationship On Learning Mathematical Reasoning With Large Language Models
(2023)
β’ No Venue
Yuan et al.
-
Recmind: Large Language Model Powered Agent For Recommendation
(2023)
β’ Findings of the Association for Computational Linguistics: NAACL 2024
β’ 43 citations
Wang et al.
-
Clinical Text Summarization: Adapting Large Language Models Can Outperform Human Experts
(2023)
β’ No Venue
Van Veen et al.
-
SAM On Medical Images: A Comprehensive Study On Three Prompt Modes
(2023)
β’ Arxiv
β’ 53 citations
Cheng et al.
-
Rolellm: Benchmarking, Eliciting, And Enhancing Role-playing Abilities Of Large Language Models
(2023)
β’ Findings of the Association for Computational Linguistics ACL 2024
β’ 51 citations
Wang et al.
-
A Picture Is Worth More Than 77 Text Tokens: Evaluating Clip-style Models On Dense Captions
(2023)
β’ No Venue
Urbanek et al.
-
Selective Structured State-spaces For Long-form Video Understanding
(2023)
β’ 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 70 citations
Wang et al.
-
Agieval: A Human-centric Benchmark For Evaluating Foundation Models
(2023)
β’ Arxiv
β’ 60 citations
Zhong et al.
-
On The Robustness Of Chatgpt: An Adversarial And Out-of-distribution Perspective
(2023)
β’ Arxiv
β’ 90 citations
Wang et al.
-
Increasing Diversity While Maintaining Accuracy: Text Data Generation With Large Language Models And Human Interventions
(2023)
β’ Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
β’ 43 citations
John Joon Young Chung, Ece Kamar, Saleema Amershi
-
Scaling Robot Learning With Semantically Imagined Experience
(2023)
β’ Robotics: Science and Systems XIX
β’ 57 citations
Yu et al.
-
Lmsys-chat-1m: A Large-scale Real-world LLM Conversation Dataset
(2023)
β’ No Venue
Zheng et al.
-
The Chime-7 DASR Challenge: Distant Meeting Transcription With Multiple Devices In Diverse Scenarios
(2023)
β’ 7th International Workshop on Speech Processing in Everyday Environments (CHiME 2023)
β’ 45 citations
Cornell et al.
-
Efficient And Effective Text Encoding For Chinese Llama And Alpaca
(2023)
β’ Arxiv
β’ 71 citations
Yiming Cui, Ziqing Yang, Xin Yao
-
Vision Grid Transformer For Document Layout Analysis
(2023)
β’ 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
β’ 40 citations
Da et al.
-
A Survey On Multimodal Large Language Models For Autonomous Driving
(2023)
β’ 2024 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW)
β’ 217 citations
Cui et al.
-
Auggpt: Leveraging Chatgpt For Text Data Augmentation
(2023)
β’ Arxiv
β’ 98 citations
Dai et al.
-
The State Of Human-centered NLP Technology For Fact-checking
(2023)
β’ Information Processing & Management
β’ 55 citations
Das et al.
-
A Decoder-only Foundation Model For Time-series Forecasting
(2023)
β’ Arxiv
β’ 41 citations
Das et al.
-
Multi-modal Self-supervised Learning For Recommendation
(2023)
β’ Proceedings of the ACM Web Conference 2023
β’ 157 citations
Wei et al.
-
K2: A Foundation Language Model For Geoscience Knowledge Understanding And Utilization
(2023)
β’ WSDM '24: The 17th ACM International Conference on Web Search and Data Mining
β’ 48 citations
Deng et al.
-
Enhancing Chat Language Models By Scaling High-quality Instructional Conversations
(2023)
β’ Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
β’ 60 citations
Ding et al.
-
MisRoBÆRTa: Transformers Versus Misinformation
(2023)
β’ Mathematics
β’ 41 citations
Ciprian-Octavian Truică, Elena-Simona Apostol
-
Lp-musiccaps: Llm-based Pseudo Music Captioning
(2023)
β’ No Venue
Doh et al.
-
Ferret: Refer And Ground Anything Anywhere At Any Granularity
(2023)
β’ Arxiv
β’ 43 citations
You et al.
-
Enhancing Job Recommendation Through Llm-based Generative Adversarial Networks
(2023)
β’ Proceedings of the AAAI Conference on Artificial Intelligence
β’ 49 citations
Du et al.
-
DNABERT-2: Efficient Foundation Model And Benchmark For Multi-species Genome
(2023)
β’ Arxiv
β’ 139 citations
Zhou et al.
-
Lmdrive: Closed-loop End-to-end Driving With Large Language Models
(2023)
β’ 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 76 citations
Shao et al.
-
Pdftriage: Question Answering Over Long, Structured Documents
(2023)
β’ No Venue
Saad-Falcon et al.
-
Detecting And Grounding Multi-modal Media Manipulation
(2023)
β’ 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 52 citations
Rui Shao, Tianxing Wu, Ziwei Liu
-
GPT-3.5, GPT-4, Or BARD? Evaluating Llms Reasoning Ability In Zero-shot Setting And Performance Boosting Through Prompts
(2023)
β’ Natural Language Processing Journal
β’ 69 citations
Espejel et al.
-
Unified Pre-training With Pseudo Texts For Text-to-image Person Re-identification
(2023)
β’ 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
β’ 48 citations
Shao et al.
-
A Multi-task Multi-stage Transitional Training Framework For Neural Chat Translation
(2023)
β’ Proceedings of the 2023 ACM International Conference on Multimedia Retrieval
β’ 48 citations
Zhou et al.
-
Fedmultimodal: A Benchmark For Multimodal Federated Learning
(2023)
β’ Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
β’ 59 citations
Feng et al.
-
Semeval-2023 Task 2: Fine-grained Multilingual Named Entity Recognition (multiconer 2)
(2023)
β’ Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
β’ 43 citations
Fetahu et al.
-
Multiconer V2: A Large Multilingual Dataset For Fine-grained And Noisy Named Entity Recognition
(2023)
β’ Findings of the Association for Computational Linguistics: EMNLP 2023
β’ 42 citations
Fetahu et al.
-
Medalign: A Clinician-generated Dataset For Instruction Following With Electronic Medical Records
(2023)
β’ No Venue
Fleming et al.
-
Mathematical Capabilities Of Chatgpt
(2023)
β’ NeurIPS 2023 Datasets and Benchmarks
β’ 293 citations
Frieder et al.
-
Chatgpt For Vulnerability Detection, Classification, And Repair: How Far Are We?
(2023)
β’ 2023 30th Asia-Pacific Software Engineering Conference (APSEC)
β’ 56 citations
Fu et al.
-
Datacomp: In Search Of The Next Generation Of Multimodal Datasets
(2023)
β’ Arxiv
β’ 72 citations
Gadre et al.
-
Bias And Fairness In Large Language Models: A Survey
(2023)
β’ Computational Linguistics
β’ 255 citations
Gallegos et al.
-
Distil-whisper: Robust Knowledge Distillation Via Large-scale Pseudo Labelling
(2023)
β’ No Venue
Sanchit Gandhi, Patrick von Platen, Alexander M. Rush
-
A Comprehensive Capability Analysis Of GPT-3 And GPT-3.5 Series Models
(2023)
β’ Arxiv
β’ 181 citations
Ye et al.
-
On The Origin Of Llms: An Evolutionary Tree And Graph For 15,821 Large Language Models
(2023)
β’ No Venue
Sarah Gao, Andrew Kean Gao
-
G-llava: Solving Geometric Problem With Multi-modal Large Language Model
(2023)
β’ No Venue
Gao et al.
-
Funasr: A Fundamental End-to-end Speech Recognition Toolkit
(2023)
β’ INTERSPEECH 2023
β’ 44 citations
Gao et al.
-
Large Language Models Are Versatile Decomposers: Decompose Evidence And Questions For Table-based Reasoning
(2023)
β’ SIGIR '23: The 46th International ACM SIGIR Conference on Research and Development in Information Retrieval
β’ 43 citations
Ye et al.
-
A Picture Is Worth A Thousand Words: Principled Recaptioning Improves Image Generation
(2023)
β’ No Venue
Segalis et al.
-
Flacuna: Unleashing The Problem Solving Power Of Vicuna Using FLAN Fine-tuning
(2023)
β’ No Venue
Ghosal et al.
-
Can Chatgpt Replace Traditional KBQA Models? An In-depth Analysis Of The Question Answering Performance Of The GPT LLM Family
(2023)
β’ Lecture Notes in Computer Science
β’ 66 citations
Tan et al.
-
Adding Conditional Control To Text-to-image Diffusion Models
(2023)
β’ 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
β’ 2580 citations
Lvmin Zhang, Anyi Rao, Maneesh Agrawala
-
Medagents: Large Language Models As Collaborators For Zero-shot Medical Reasoning
(2023)
β’ Findings of the Association for Computational Linguistics ACL 2024
β’ 53 citations
Tang et al.
-
PIPPA: A Partially Synthetic Conversational Dataset
(2023)
β’ No Venue
Tear Gosling, Alpin Dale, Yinhe Zheng
-
Detecting And Preventing Hallucinations In Large Vision Language Models
(2023)
β’ Proceedings of the AAAI Conference on Artificial Intelligence
β’ 93 citations
Anisha Gunjal, Jihan Yin, Erhan Bas
-
Text With Knowledge Graph Augmented Transformer For Video Captioning
(2023)
β’ 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 57 citations
Gu et al.
-
Editing Large Language Models: Problems, Methods, And Opportunities
(2023)
β’ Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
β’ 55 citations
Yao et al.
-
Legalbench: A Collaboratively Built Benchmark For Measuring Legal Reasoning In Large Language Models
(2023)
β’ SSRN Electronic Journal
β’ 77 citations
Guha et al.
-
Verigen: A Large Language Model For Verilog Code Generation
(2023)
β’ ACM Transactions on Design Automation of Electronic Systems
β’ 129 citations
Thakur et al.
-
Connecting Large Language Models With Evolutionary Algorithms Yields Powerful Prompt Optimizers
(2023)
β’ No Venue
Guo et al.
-
PPTC Benchmark: Evaluating Large Language Models For Powerpoint Task Completion
(2023)
β’ No Venue
Guo et al.
-
Chatie: Zero-shot Information Extraction Via Chatting With Chatgpt
(2023)
β’ Arxiv
β’ 141 citations
Wei et al.
-
Medalpaca -- An Open-source Collection Of Medical Conversational AI Models And Training Data
(2023)
β’ Arxiv
β’ 102 citations
Han et al.
-
Stylegan-t: Unlocking The Power Of Gans For Fast Large-scale Text-to-image Synthesis
(2023)
β’ Arxiv
β’ 59 citations
Sauer et al.
-
Leveraging Large Language Models For Sequential Recommendation
(2023)
β’ RecSys '23: Seventeenth ACM Conference on Recommender Systems
β’ 88 citations
Harte et al.
-
Annollm: Making Large Language Models To Be Better Crowdsourced Annotators
(2023)
β’ Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 6: Industry Track)
β’ 46 citations
He et al.
-
Align And Attend: Multimodal Summarization With Dual Contrastive Losses
(2023)
β’ 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 72 citations
He et al.
-
Reinforcement Learning-based Counter-misinformation Response Generation: A Case Study Of COVID-19 Vaccine Misinformation
(2023)
β’ Proceedings of the ACM Web Conference 2023
β’ 44 citations
Bing He, Mustaque Ahamad, Srijan Kumar
-
A Survey On Uncertainty Quantification Methods For Deep Learning
(2023)
β’ Arxiv
β’ 50 citations
He et al.
-
From Words To Watts: Benchmarking The Energy Costs Of Large Language Model Inference
(2023)
β’ 2023 IEEE High Performance Extreme Computing Conference (HPEC)
β’ 104 citations
Samsi et al.
-
Biomedclip: A Multimodal Biomedical Foundation Model Pretrained From Fifteen Million Scientific Image-text Pairs
(2023)
β’ Arxiv
β’ 87 citations
Zhang et al.
-
Copiloting The Copilots: Fusing Large Language Models With Completion Engines For Automated Program Repair
(2023)
β’ ESEC/FSE '23: 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering
β’ 75 citations
Yuxiang Wei, Chunqiu Steven Xia, Lingming Zhang
-
LRM: Large Reconstruction Model For Single Image To 3D
(2023)
β’ No Venue
Hong et al.
-
Large Language Models Are Zero-shot Rankers For Recommender Systems
(2023)
β’ Lecture Notes in Computer Science
β’ 155 citations
Hou et al.
-
Bad Actor, Good Advisor: Exploring The Role Of Large Language Models In Fake News Detection
(2023)
β’ Proceedings of the AAAI Conference on Artificial Intelligence
β’ 104 citations
Hu et al.
-
RSGPT: A Remote Sensing Vision Language Model And Benchmark
(2023)
β’ ISPRS Journal of Photogrammetry and Remote Sensing
β’ 46 citations
Hu et al.
-
Llm-adapters: An Adapter Family For Parameter-efficient Fine-tuning Of Large Language Models
(2023)
β’ Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
β’ 128 citations
Hu et al.
-
Vid2seq: Large-scale Pretraining Of A Visual Language Model For Dense Video Captioning
(2023)
β’ 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 179 citations
Yang et al.
-
Towards Interpretable Mental Health Analysis With Large Language Models
(2023)
β’ Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
β’ 84 citations
Yang et al.
-
How To Do Things With Deep Learning Code
(2023)
β’ Proceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region
β’ 41 citations
Minh Hua, Rita Raley
-
Swin3d: A Pretrained Transformer Backbone For 3D Indoor Scene Understanding
(2023)
β’ Computational Visual Media
β’ 47 citations
Yang et al.
-
C-eval: A Multi-level Multi-discipline Chinese Evaluation Suite For Foundation Models
(2023)
β’ Arxiv
β’ 89 citations
Huang et al.
-
Segment And Caption Anything
(2023)
β’ No Venue
Huang et al.
-
Diversity-aware Meta Visual Prompting
(2023)
β’ 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 41 citations
Huang et al.
-
Make-an-audio: Text-to-audio Generation With Prompt-enhanced Diffusion Models
(2023)
β’ Arxiv
β’ 46 citations
Huang et al.
-
Med-halt: Medical Domain Hallucination Test For Large Language Models
(2023)
β’ Proceedings of the 27th Conference on Computational Natural Language Learning (CoNLL)
β’ 54 citations
Ankit Pal, Logesh Kumar Umapathi, Malaikannan Sankarasubbu
-
GPQA: A Graduate-level Google-proof Q&A Benchmark
(2023)
β’ No Venue
Rein et al.
-
Universalner: Targeted Distillation From Large Language Models For Open Named Entity Recognition
(2023)
β’ No Venue
Zhou et al.
-
Exploring The Limits Of Chatgpt For Query Or Aspect-based Text Summarization
(2023)
β’ Arxiv
β’ 89 citations
Yang et al.
-
Fusecap: Leveraging Large Language Models For Enriched Fused Image Captions
(2023)
β’ 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
β’ 47 citations
Rotstein et al.
-
Starvector: Generating Scalable Vector Graphics Code From Images
(2023)
β’ No Venue
Rodriguez et al.
-
Quilt-1m: One Million Image-text Pairs For Histopathology
(2023)
β’ Arxiv
β’ 52 citations
Ikezogwo et al.
-
From Image To Language: A Critical Analysis Of Visual Question Answering (VQA) Approaches, Challenges, And Opportunities
(2023)
β’ Information Fusion
β’ 58 citations
Ishmam et al.
-
Conceptfusion: Open-set Multimodal 3D Mapping
(2023)
β’ Robotics: Science and Systems XIX
β’ 142 citations
Jatavallabhula et al.
-
Camels In A Changing Climate: Enhancing LM Adaptation With Tulu 2
(2023)
β’ No Venue
Ivison et al.
-
A Comprehensive Evaluation Of Large Language Models On Benchmark Biomedical Text Processing Tasks
(2023)
β’ Computers in Biology and Medicine
β’ 61 citations
Jahan et al.
-
Llmlingua: Compressing Prompts For Accelerated Inference Of Large Language Models
(2023)
β’ Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
β’ 64 citations
Jiang et al.
-
Llm-blender: Ensembling Large Language Models With Pairwise Ranking And Generative Fusion
(2023)
β’ Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
β’ 61 citations
Dongfu Jiang, Xiang Ren, Bill Yuchen Lin
-
Active Retrieval Augmented Generation
(2023)
β’ Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
β’ 216 citations
Jiang et al.
-
Clip-count: Towards Text-guided Zero-shot Object Counting
(2023)
β’ Proceedings of the 31st ACM International Conference on Multimedia
β’ 50 citations
Ruixiang Jiang, Lingbo Liu, Changwen Chen
-
Pixellm: Pixel Reasoning With Large Multimodal Model
(2023)
β’ 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 42 citations
Ren et al.
-
Large Language Models As Zero-shot Human Models For Human-robot Interaction
(2023)
β’ 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
β’ 52 citations
Bowen Zhang, Harold Soh
-
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
(2023)
β’ Arxiv
β’ 111 citations
Zhang et al.
-
Sabiá: Portuguese Large Language Models
(2023)
β’ Lecture Notes in Computer Science
β’ 46 citations
Pires et al.
-
Video-llava: Learning United Visual Representation By Alignment Before Projection
(2023)
β’ No Venue
Lin et al.
-
DIN-SQL: Decomposed In-context Learning Of Text-to-sql With Self-correction
(2023)
β’ Arxiv
β’ 53 citations
Mohammadreza Pourreza, Davood Rafiei
-
Univtg: Towards Unified Video-language Temporal Grounding
(2023)
β’ 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
β’ 73 citations
Lin et al.
-
Crowdclip: Unsupervised Crowd Counting Via Vision-language Model
(2023)
β’ 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 58 citations
Liang et al.
-
Segment Everything Everywhere All At Once
(2023)
β’ Arxiv
β’ 151 citations
Zou et al.
-
PMC-CLIP: Contrastive Language-image Pre-training Using Biomedical Documents
(2023)
β’ Lecture Notes in Computer Science
β’ 110 citations
Lin et al.
-
Text Is All You Need: Learning Language Representations For Sequential Recommendation
(2023)
β’ Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
β’ 103 citations
Li et al.
-
Teach Llms To Personalize -- An Approach Inspired By Writing Education
(2023)
β’ No Venue
Li et al.
-
Revisiting K-nn For Fine-tuning Pre-trained Language Models
(2023)
β’ Proceedings of the 31st ACM International Conference on Multimedia
β’ 61 citations
Li et al.
-
Seed-bench: Benchmarking Multimodal Llms With Generative Comprehension
(2023)
β’ 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 92 citations
Li et al.
-
Skcoder: A Sketch-based Approach For Automatic Code Generation
(2023)
β’ 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE)
β’ 42 citations
Li et al.
-
Synthetic Data Generation With Large Language Models For Text Classification: Potential And Limitations
(2023)
β’ Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
β’ 70 citations
Li et al.
-
Videochat: Chat-centric Video Understanding
(2023)
β’ Arxiv
β’ 90 citations
Li et al.
-
Unsafe Diffusion: On The Generation Of Unsafe Images And Hateful Memes From Text-to-image Models
(2023)
β’ Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security
β’ 42 citations
Qu et al.
-
Llmrec: Large Language Models With Graph Augmentation For Recommendation
(2023)
β’ WSDM '24: The 17th ACM International Conference on Web Search and Data Mining
β’ 134 citations
Wei et al.
-
Learning Open-vocabulary Semantic Segmentation Models From Natural Language Supervision
(2023)
β’ 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 65 citations
Xu et al.
-
Pointllm: Empowering Large Language Models To Understand Point Clouds
(2023)
β’ Lecture Notes in Computer Science
β’ 45 citations
Xu et al.
-
Multi: Efficient Video-and-language Understanding With Text-guided Multiway-sampler And Multiple Choice Modeling
(2023)
β’ Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval
β’ 50 citations
Xu et al.
-
Imagereward: Learning And Evaluating Human Preferences For Text-to-image Generation
(2023)
β’ Arxiv
β’ 99 citations
Xu et al.
-
Demystifying CLIP Data
(2023)
β’ No Venue
Xu et al.
-
Knowledge-enhanced Visual-language Pre-training On Chest Radiology Images
(2023)
β’ Nature Communications
β’ 134 citations
Zhang et al.
-
Learning Disentangled Semantic Spaces Of Explanations Via Invertible Neural Networks
(2023)
β’ Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
β’ 79 citations
Yingji Zhang, Danilo S. Carvalho, André Freitas
-
Magicbrush: A Manually Annotated Dataset For Instruction-guided Image Editing
(2023)
β’ No Venue
Zhang et al.
-
Scaling Up Gans For Text-to-image Synthesis
(2023)
β’ 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 313 citations
Kang et al.
-
SMART-LLM: Smart Multi-agent Robot Task Planning Using Large Language Models
(2023)
β’ 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
β’ 79 citations
Shyam Sundar Kannan, Vishnunandan L. N. Venkatesh, Byung-Cheol Min
-
Language-driven Representation Learning For Robotics
(2023)
β’ Robotics: Science and Systems XIX
β’ 47 citations
Karamcheti et al.
-
Large Content And Behavior Models To Understand, Simulate, And Optimize Content And Behavior
(2023)
β’ No Venue
Khandelwal et al.
-
Gptaraeval: A Comprehensive Evaluation Of Chatgpt On Arabic NLP
(2023)
β’ Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
β’ 41 citations
Khondaker et al.
-
Prometheus: Inducing Fine-grained Evaluation Capability In Language Models
(2023)
β’ No Venue
Kim et al.
-
A Survey Of Learning-based Automated Program Repair
(2023)
β’ ACM Transactions on Software Engineering and Methodology
β’ 77 citations
Zhang et al.
-
Is Chatgpt A General-purpose Natural Language Processing Task Solver?
(2023)
β’ Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
β’ 410 citations
Qin et al.
-
The Troubling Emergence Of Hallucination In Large Language Models -- An Extensive Definition, Quantification, And Prescriptive Remediations
(2023)
β’ Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
β’ 56 citations
Rawte et al.
-
Pick-a-pic: An Open Dataset Of User Preferences For Text-to-image Generation
(2023)
β’ Arxiv
β’ 41 citations
Kirstain et al.
-
Gender Bias And Stereotypes In Large Language Models
(2023)
β’ CI '23: Collective Intelligence Conference
β’ 216 citations
Hadas Kotek, Rikker Dockum, David Q. Sun
-
RS5M And Georsclip: A Large Scale Vision-language Dataset And A Large Vision-language Model For Remote Sensing
(2023)
β’ IEEE Transactions on Geoscience and Remote Sensing
β’ 43 citations
Zhang et al.
-
Vision Language Models In Autonomous Driving: A Survey And Outlook
(2023)
β’ IEEE Transactions on Intelligent Vehicles
β’ 46 citations
Zhou et al.
-
Personalize Segment Anything Model With One Shot
(2023)
β’ Arxiv
β’ 65 citations
Zhang et al.
-
Sentiment Analysis In The Era Of Large Language Models: A Reality Check
(2023)
β’ Findings of the Association for Computational Linguistics: NAACL 2024
β’ 161 citations
Zhang et al.
-
MADLAD-400: A Multilingual And Document-level Large Audited Dataset
(2023)
β’ No Venue
Kudugunta et al.
-
Geochat: Grounded Large Vision-language Model For Remote Sensing
(2023)
β’ 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 102 citations
Kuckreja et al.
-
Summit: Iterative Text Summarization Via Chatgpt
(2023)
β’ Findings of the Association for Computational Linguistics: EMNLP 2023
β’ 42 citations
Haopeng Zhang, Xiao Liu, Jiawei Zhang
-
Glamm: Pixel Grounding Large Multimodal Model
(2023)
β’ No Venue
Rasheed et al.
-
Chatgpt: Beginning Of An End Of Manual Linguistic Data Annotation? Use Case Of Automatic Genre Identification
(2023)
β’ Arxiv
β’ 64 citations
Taja Kuzman, Igor Mozetič, Nikola Ljubešić
-
Vision-language Models For Vision Tasks: A Survey
(2023)
β’ IEEE Transactions on Pattern Analysis and Machine Intelligence
β’ 403 citations
Zhang et al.
-
LISA: Reasoning Segmentation Via Large Language Model
(2023)
β’ 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 205 citations
Lai et al.
-
Chatgpt Beyond English: Towards A Comprehensive Evaluation Of Large Language Models In Multilingual Learning
(2023)
β’ Findings of the Association for Computational Linguistics: EMNLP 2023
β’ 91 citations
Lai et al.
-
Evaluation Of Chatgpt For Nlp-based Mental Health Applications
(2023)
β’ Arxiv
β’ 54 citations
Bishal Lamichhane
-
A Systematic Study And Comprehensive Evaluation Of Chatgpt On Benchmark Datasets
(2023)
β’ Findings of the Association for Computational Linguistics: ACL 2023
β’ 69 citations
Laskar et al.
-
OBELICS: An Open Web-scale Filtered Dataset Of Interleaved Image-text Documents
(2023)
β’ No Venue
Laurençon et al.
-
The Bigscience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
(2023)
β’ Arxiv
β’ 65 citations
Laurençon et al.
-
Platypus: Quick, Cheap, And Powerful Refinement Of Llms
(2023)
β’ No Venue
Ariel N. Lee, Cole J. Hunter, Nataniel Ruiz
-
Toolllm: Facilitating Large Language Models To Master 16000+ Real-world Apis
(2023)
β’ No Venue
Qin et al.
-
In Chatgpt We Trust? Measuring And Characterizing The Reliability Of Chatgpt
(2023)
β’ Arxiv
β’ 71 citations
Shen et al.
-
Video-llama: An Instruction-tuned Audio-visual Language Model For Video Understanding
(2023)
β’ Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
β’ 367 citations
Hang Zhang, Xin Li, Lidong Bing
-
Filter-enhanced MLP Is All You Need For Sequential Recommendation
(2022)
β’ Proceedings of the ACM Web Conference 2022
β’ 283 citations
Zhou et al.
-
Few-shot Class-incremental Learning By Sampling Multi-phase Tasks
(2022)
β’ IEEE Transactions on Pattern Analysis and Machine Intelligence
β’ 108 citations
Zhou et al.
-
ELEVATER: A Benchmark And Toolkit For Evaluating Language-augmented Visual Models
(2022)
β’ Arxiv
β’ 64 citations
Li et al.
-
Compositional Temporal Grounding With Structured Variational Cross-graph Correspondence Learning
(2022)
β’ 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 57 citations
Li et al.
-
Automating Code Review Activities By Large-scale Pre-training
(2022)
β’ Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering
β’ 129 citations
Li et al.
-
BLIP: Bootstrapping Language-image Pre-training For Unified Vision-language Understanding And Generation
(2022)
β’ Arxiv
β’ 850 citations
Li et al.
-
CLMLF: A Contrastive Learning And Multi-layer Fusion Method For Multimodal Sentiment Detection
(2022)
β’ Findings of the Association for Computational Linguistics: NAACL 2022
β’ 85 citations
Li et al.
-
Envedit: Environment Editing For Vision-and-language Navigation
(2022)
β’ 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 64 citations
Jialu Li, Hao Tan, Mohit Bansal
-
User-centric Conversational Recommendation With Multi-aspect User Modeling
(2022)
β’ Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
β’ 52 citations
Li et al.
-
Detect Rumors In Microblog Posts For Low-resource Domains Via Adversarial Contrastive Learning
(2022)
β’ Findings of the Association for Computational Linguistics: NAACL 2022
β’ 41 citations
Lin et al.
-
Data Cards: Purposeful And Transparent Dataset Documentation For Responsible AI
(2022)
β’ 2022 ACM Conference on Fairness Accountability and Transparency
β’ 140 citations
Mahima Pushkarna, Andrew Zaldivar, Oddur Kjartansson
-
Visual-language Navigation Pretraining Via Prompt-based Environmental Self-exploration
(2022)
β’ 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 46 citations
Liang et al.
-
Proposalclip: Unsupervised Open-category Object Proposal Generation Via Exploiting CLIP Cues
(2022)
β’ 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 51 citations
Shi et al.
-
Egocentric Video-language Pretraining
(2022)
β’ Arxiv
β’ 45 citations
Lin et al.
-
RASAT: Integrating Relational Structures Into Pretrained Seq2seq Model For Text-to-sql
(2022)
β’ Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
β’ 60 citations
Qi et al.
-
Transrac: Encoding Multi-scale Temporal Correlation With Transformers For Repetitive Action Counting
(2022)
β’ 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 56 citations
Hu et al.
-
Large Language Models Can Self-improve
(2022)
β’ Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
β’ 96 citations
Huang et al.
-
Swintextspotter: Scene Text Spotting Via Better Synergy Between Text Detection And Text Recognition
(2022)
β’ 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 128 citations
Huang et al.
-
POLITICS: Pretraining With Same-story Article Comparison For Ideology Prediction And Stance Detection
(2022)
β’ Findings of the Association for Computational Linguistics: NAACL 2022
β’ 40 citations
Liu et al.
-
Tganet: Text-guided Attention For Improved Polyp Segmentation
(2022)
β’ Lecture Notes in Computer Science
β’ 141 citations
Tomar et al.
-
Discovering Language Model Behaviors With Model-written Evaluations
(2022)
β’ Findings of the Association for Computational Linguistics: ACL 2023
β’ 47 citations
Perez et al.
-
A Prompting-based Approach For Adversarial Example Generation And Robustness Enhancement
(2022)
β’ Proceedings of the 30th ACM International Conference on Multimedia
β’ 108 citations
Yang et al.
-
Zero-shot Video Question Answering Via Frozen Bidirectional Language Models
(2022)
β’ Arxiv
β’ 64 citations
Yang et al.
-
Robots Enact Malignant Stereotypes
(2022)
β’ 2022 ACM Conference on Fairness Accountability and Transparency
β’ 43 citations
Hundt et al.
-
Chinese CLIP: Contrastive Vision-language Pretraining In Chinese
(2022)
β’ Arxiv
β’ 51 citations
Yang et al.
-
Scaling Up Models And Data With t5x And seqio
(2022)
β’ Arxiv
β’ 47 citations
Roberts et al.
-
Entity-enhanced Adaptive Reconstruction Network For Weakly Supervised Referring Expression Grounding
(2022)
β’ IEEE Transactions on Pattern Analysis and Machine Intelligence
β’ 48 citations
Liu et al.
-
Codefill: Multi-token Code Completion By Jointly Learning From Structure And Naming Sequences
(2022)
β’ ICSE '22: 44th International Conference on Software Engineering
β’ 67 citations
Maliheh Izadi, Roberta Gismondi, Georgios Gousios
-
Effective Token Graph Modeling Using A Novel Labeling Strategy For Structured Sentiment Analysis
(2022)
β’ Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
β’ 71 citations
Shi et al.
-
On The Importance Of Building High-quality Training Datasets For Neural Code Search
(2022)
β’ Proceedings of the 44th International Conference on Software Engineering
β’ 61 citations
Sun et al.
-
UMT: Unified Multi-modal Transformers For Joint Video Moment Retrieval And Highlight Detection
(2022)
β’ 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 125 citations
Liu et al.
-
Reducing The Vision And Language Bias For Temporal Sentence Grounding
(2022)
β’ Proceedings of the 30th ACM International Conference on Multimedia
β’ 45 citations
Daizong Liu, Xiaoye Qu, Wei Hu
-
Asymmetric Cross-scale Alignment For Text-based Person Search
(2022)
β’ IEEE Transactions on Multimedia
β’ 54 citations
Ji et al.
-
Partslip: Low-shot Part Segmentation For 3D Point Clouds Via Pretrained Image-language Models
(2022)
β’ 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 46 citations
Liu et al.
-
Pseudo-q: Generating Pseudo Language Queries For Visual Grounding
(2022)
β’ 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 54 citations
Jiang et al.
-
Automatic Text Summarization Methods: A Comprehensive Review
(2022)
β’ SN Computer Science
β’ 68 citations
Divakar Yadav, Jalpa Desai, Arun Kumar Yadav
-
GL-RG: Global-local Representation Granularity For Video Captioning
(2022)
β’ Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence
β’ 52 citations
Yan et al.
-
Chart-to-text: A Large-scale Benchmark For Chart Summarization
(2022)
β’ Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
β’ 70 citations
Kantharaj et al.
-
Local-global Context Aware Transformer For Language-guided Video Segmentation
(2022)
β’ IEEE Transactions on Pattern Analysis and Machine Intelligence
β’ 72 citations
Liang et al.
-
Fantastic Questions And Where To Find Them: Fairytaleqa -- An Authentic Dataset For Narrative Comprehension
(2022)
β’ Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
β’ 48 citations
Xu et al.
-
Prosocialdialog: A Prosocial Backbone For Conversational Agents
(2022)
β’ Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
β’ 42 citations
Kim et al.
-
Perturbation Augmentation For Fairer NLP
(2022)
β’ Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
β’ 50 citations
Qian et al.
-
Clip-vip: Adapting Pre-trained Image-text Model To Video-language Representation Alignment
(2022)
β’ Arxiv
β’ 53 citations
Xue et al.
-
ULIP: Learning A Unified Representation Of Language, Images, And Point Clouds For 3D Understanding
(2022)
β’ 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 169 citations
Xue et al.
-
Leveraging Language Foundation Models For Human Mobility Forecasting
(2022)
β’ Proceedings of the 30th International Conference on Advances in Geographic Information Systems
β’ 48 citations
Hao Xue, Bhanu Prakash Voutharoja, Flora D. Salim
-
An Empirical Survey On Long Document Summarization: Datasets, Models And Metrics
(2022)
β’ ACM Computing Surveys
β’ 69 citations
Koh et al.
-
A Two-stream Amr-enhanced Model For Document-level Event Argument Extraction
(2022)
β’ Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
β’ 47 citations
Xu et al.
-
Beyond A Pre-trained Object Detector: Cross-modal Textual And Visual Context For Image Captioning
(2022)
β’ 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 85 citations
Chia-Wen Kuo, Zsolt Kira
-
Multi-task Learning With Multi-query Transformer For Dense Prediction
(2022)
β’ IEEE Transactions on Circuits and Systems for Video Technology
β’ 45 citations
Xu et al.
-
Groupvit: Semantic Segmentation Emerges From Text Supervision
(2022)
β’ 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 352 citations
Xu et al.
-
Learn From Structural Scope: Improving Aspect-level Sentiment Analysis With Hybrid Graph Convolutional Networks
(2022)
β’ Neurocomputing
β’ 45 citations
Xu et al.
-
PEER: A Comprehensive And Multi-task Benchmark For Protein Sequence Understanding
(2022)
β’ Arxiv
β’ 58 citations
Xu et al.
-
Multihiertt: Numerical Reasoning Over Multi Hierarchical Tabular And Textual Data
(2022)
β’ Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
β’ 46 citations
Zhao et al.
-
Coauthor: Designing A Human-ai Collaborative Writing Dataset For Exploring Language Model Capabilities
(2022)
β’ CHI '22: CHI Conference on Human Factors in Computing Systems
β’ 223 citations
Mina Lee, Percy Liang, Qian Yang
-
Improving Mispronunciation Detection With Wav2vec2-based Momentum Pseudo-labeling For Accentedness And Intelligibility Assessment
(2022)
β’ 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 129 citations
Yang et al.
-
Empathetic Conversational Systems: A Review Of Current Advances, Gaps, And Opportunities
(2022)
β’ IEEE Transactions on Affective Computing
β’ 44 citations
Aravind Sesagiri Raamkumar, Yinping Yang
-
Benchmarking Large Language Models For Automated Verilog RTL Code Generation
(2022)
β’ 2023 Design, Automation & Test in Europe Conference & Exhibition (DATE)
β’ 112 citations
Thakur et al.
-
Progen2: Exploring The Boundaries Of Protein Language Models
(2022)
β’ Cell Systems
β’ 286 citations
Nijkamp et al.
-
NADI 2022: The Third Nuanced Arabic Dialect Identification Shared Task
(2022)
β’ Proceedings of the Seventh Arabic Natural Language Processing Workshop (WANLP)
β’ 48 citations
Abdul-Mageed et al.
-
A Few Thousand Translations Go A Long Way! Leveraging Pre-trained Models For African News Translation
(2022)
β’ Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
β’ 44 citations
Adelani et al.
-
Learning Audio-video Modalities From Image Captions
(2022)
β’ Lecture Notes in Computer Science
β’ 48 citations
Nagrani et al.
-
Large Language Models Are Few-shot Clinical Information Extractors
(2022)
β’ Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
β’ 212 citations
Agrawal et al.
-
Chemberta-2: Towards Chemical Foundation Models
(2022)
β’ Arxiv
β’ 120 citations
Ahmad et al.
-
MTEB: Massive Text Embedding Benchmark
(2022)
β’ Arxiv
β’ 57 citations
Muennighoff et al.
-
USB: A Unified Semi-supervised Learning Benchmark For Classification
(2022)
β’ Arxiv
β’ 42 citations
Wang et al.
-
Towards Data-efficient Detection Transformers
(2022)
β’ Lecture Notes in Computer Science
β’ 52 citations
Wang et al.
-
Self-consistency Improves Chain Of Thought Reasoning In Language Models
(2022)
β’ Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
β’ 365 citations
Wang et al.
-
Scene Text Recognition With Permuted Autoregressive Sequence Models
(2022)
β’ Lecture Notes in Computer Science
β’ 183 citations
Darwin Bautista, Rowel Atienza
-
Text Embeddings By Weakly-supervised Contrastive Pre-training
(2022)
β’ Arxiv
β’ 107 citations
Wang et al.
-
Simkgc: Simple Contrastive Knowledge Graph Completion With Pre-trained Language Models
(2022)
β’ Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
β’ 151 citations
Wang et al.
-
Refined: An Efficient Zero-shot-capable Approach To End-to-end Entity Linking
(2022)
β’ Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track
β’ 53 citations
Ayoola et al.
-
Multimae: Multi-modal Multi-task Masked Autoencoders
(2022)
β’ Lecture Notes in Computer Science
β’ 186 citations
Bachmann et al.
-
Promptsource: An Integrated Development Environment And Repository For Natural Language Prompts
(2022)
β’ Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations
β’ 148 citations
Bach et al.
-
Training A Helpful And Harmless Assistant With Reinforcement Learning From Human Feedback
(2022)
β’ Arxiv
β’ 346 citations
Bai et al.
-
Medclip: Contrastive Learning From Unpaired Medical Images And Text
(2022)
β’ Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
β’ 402 citations
Wang et al.
-
Building Machine Translation Systems For The Next Thousand Languages
(2022)
β’ Arxiv
β’ 43 citations
Bapna et al.
-
Lila: A Unified Benchmark For Mathematical Reasoning
(2022)
β’ Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
β’ 70 citations
Mishra et al.
-
RT-1: Robotics Transformer For Real-world Control At Scale
(2022)
β’ Robotics: Science and Systems 2023
β’ 372 citations
Brohan et al.
-
Prompting GPT-3 To Be Reliable
(2022)
β’ Arxiv
β’ 68 citations
Si et al.
-
Numglue: A Suite Of Fundamental Yet Challenging Mathematical Reasoning Tasks
(2022)
β’ Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
β’ 117 citations
Mishra et al.
-
Are Transformers Effective For Time Series Forecasting?
(2022)
β’ Proceedings of the AAAI Conference on Artificial Intelligence
β’ 1544 citations
Zeng et al.
-
Exploiting Unlabeled Data With Vision And Language Models For Object Detection
(2022)
β’ Lecture Notes in Computer Science
β’ 74 citations
Zhao et al.
-
A Model-agnostic Data Manipulation Method For Persona-based Dialogue Generation
(2022)
β’ Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
β’ 40 citations
Cao et al.
-
Open-vocabulary DETR With Conditional Matching
(2022)
β’ Lecture Notes in Computer Science
β’ 155 citations
Zang et al.
-
Generating Data To Mitigate Spurious Correlations In Natural Language Inference Datasets
(2022)
β’ Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
β’ 43 citations
Wu et al.
-
Roentgen: Vision-language Foundation Model For Chest X-ray Generation
(2022)
β’ Arxiv
β’ 55 citations
Chambon et al.
-
Adapting Pretrained Vision-language Foundational Models To Medical Imaging Domains
(2022)
β’ Foundation Models for Decision Making Workshop at Neural Information Processing Systems 2022
β’ 43 citations
Chambon et al.
-
Unified Vision And Language Prompt Learning
(2022)
β’ Arxiv
β’ 54 citations
Zang et al.
-
Large Language Models Are Few(1)-shot Table Reasoners
(2022)
β’ Findings of the Association for Computational Linguistics: EACL 2023
β’ 41 citations
Wenhu Chen
-
Gatehub: Gated History Unit With Background Suppression For Online Action Detection
(2022)
β’ 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 44 citations
Chen et al.
-
Murag: Multimodal Retrieval-augmented Generator For Open Question Answering Over Images And Text
(2022)
β’ Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
β’ 65 citations
Chen et al.
-
Program Of Thoughts Prompting: Disentangling Computation From Reasoning For Numerical Reasoning Tasks
(2022)
β’ Arxiv
β’ 110 citations
Chen et al.
-
A Simple Multi-modality Transfer Learning Baseline For Sign Language Translation
(2022)
β’ 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 120 citations
Chen et al.
-
What Matters In Language Conditioned Robotic Imitation Learning Over Unstructured Data
(2022)
β’ IEEE Robotics and Automation Letters
β’ 49 citations
Oier Mees, Lukas Hermann, Wolfram Burgard
-
Bidirectional Cross-modal Knowledge Exploration For Video Recognition With Pre-trained Vision-language Models
(2022)
β’ 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 75 citations
Wu et al.
-
Locating And Editing Factual Associations In GPT
(2022)
β’ Arxiv
β’ 172 citations
Meng et al.
-
Winoground: Probing Vision And Language Models For Visio-linguistic Compositionality
(2022)
β’ 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 132 citations
Thrush et al.
-
Promda: Prompt-based Data Augmentation For Low-resource NLU Tasks
(2022)
β’ Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
β’ 61 citations
Wang et al.
-
X-trans2cap: Cross-modal Knowledge Transfer Using Transformer For 3D Dense Captioning
(2022)
β’ 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 62 citations
Yuan et al.
-
Phenaki: Variable Length Video Generation From Open Domain Textual Description
(2022)
β’ Arxiv
β’ 78 citations
Villegas et al.
-
The Moral Integrity Corpus: A Benchmark For Ethical Dialogue Systems
(2022)
β’ Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
β’ 53 citations
Ziems et al.
-
Ernie-layout: Layout Knowledge Enhanced Pre-training For Visually-rich Document Understanding
(2022)
β’ Findings of the Association for Computational Linguistics: EMNLP 2022
β’ 53 citations
Peng et al.
-
Medmcqa: A Large-scale Multi-subject Multi-choice Dataset For Medical Domain Question Answering
(2022)
β’ ACM Conference on Health Inference and Learning (CHIL) 2022
β’ 72 citations
Ankit Pal, Logesh Kumar Umapathi, Malaikannan Sankarasubbu
-
Heterogeneous Ensemble Knowledge Transfer For Training Large Models In Federated Learning
(2022)
β’ Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence
β’ 82 citations
Cho et al.
-
Fine-grained Image Captioning With CLIP Reward
(2022)
β’ Findings of the Association for Computational Linguistics: NAACL 2022
β’ 52 citations
Cho et al.
-
Multiconer: A Large-scale Multilingual Dataset For Complex Named Entity Recognition
(2022)
β’ Arxiv
β’ 82 citations
Malmasi et al.
-
Large Language Models Encode Clinical Knowledge
(2022)
β’ Nature
β’ 1963 citations
Singhal et al.
-
M-SENA: An Integrated Platform For Multimodal Sentiment Analysis
(2022)
β’ Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations
β’ 58 citations
Mao et al.
-
ZSON: Zero-shot Object-goal Navigation Using Multimodal Goal Embeddings
(2022)
β’ Arxiv
β’ 41 citations
Majumdar et al.
-
Storydall-e: Adapting Pretrained Text-to-image Transformers For Story Continuation
(2022)
β’ Lecture Notes in Computer Science
β’ 47 citations
Adyasha Maharana, Darryl Hannan, Mohit Bansal
-
"this Is My Unicorn, Fluffy": Personalizing Frozen Vision-language Representations
(2022)
β’ Lecture Notes in Computer Science
β’ 42 citations
Cohen et al.
-
Clip-art: Contrastive Pre-training For Fine-grained Art Classification
(2022)
β’ 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
β’ 88 citations
Marcos V. Conde, Kerem Turgutlu
-
No Language Left Behind: Scaling Human-centered Machine Translation
(2022)
β’ Arxiv
β’ 354 citations
Team et al.
-
See Finer, See More: Implicit Modality Alignment For Text-based Person Retrieval
(2022)
β’ Lecture Notes in Computer Science
β’ 102 citations
Shu et al.
-
I2mvformer: Large Language Model Generated Multi-view Document Supervision For Zero-shot Image Classification
(2022)
β’ 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 57 citations
Naeem et al.
-
A Survey On Legal Judgment Prediction: Datasets, Metrics, Models And Challenges
(2022)
β’ IEEE Access
β’ 46 citations
Cui et al.
-
Teaching Small Language Models To Reason
(2022)
β’ Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
β’ 45 citations
Magister et al.
-
Structured Pruning Learns Compact And Accurate Models
(2022)
β’ Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
β’ 99 citations
Mengzhou Xia, Zexuan Zhong, Danqi Chen
-
MISC: A Mixed Strategy-aware Model Integrating COMET For Emotional Support Conversation
(2022)
β’ Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
β’ 66 citations
Tu et al.
-
Improving The Factual Correctness Of Radiology Report Generation With Semantic Rewards
(2022)
β’ Findings of the Association for Computational Linguistics: EMNLP 2022
β’ 41 citations
Delbrouck et al.
-
COLD: A Benchmark For Chinese Offensive Language Detection
(2022)
β’ Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
β’ 66 citations
Deng et al.
-
Visual Speech Recognition For Multiple Languages In The Wild
(2022)
β’ Nature Machine Intelligence
β’ 130 citations
Pingchuan Ma, Stavros Petridis, Maja Pantic
-
Reading-strategy Inspired Visual Representation Learning For Text-to-video Retrieval
(2022)
β’ IEEE Transactions on Circuits and Systems for Video Technology
β’ 67 citations
Dong et al.
-
Teaching Structured Vision&language Concepts To Vision&language Models
(2022)
β’ 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 41 citations
Doveh et al.
-
On The Origin Of Hallucinations In Conversational Models: Is It The Datasets Or The Models?
(2022)
β’ Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
β’ 101 citations
Dziri et al.
-
One Embedder, Any Task: Instruction-finetuned Text Embeddings
(2022)
β’ Findings of the Association for Computational Linguistics: ACL 2023
β’ 68 citations
Su et al.
-
Ontology-enhanced Prompt-tuning For Few-shot Learning
(2022)
β’ Proceedings of the ACM Web Conference 2022
β’ 57 citations
Ye et al.
-
Mintrec: A New Dataset For Multimodal Intent Recognition
(2022)
β’ Proceedings of the 30th ACM International Conference on Multimedia
β’ 41 citations
Zhang et al.
-
3D-SPS: Single-stage 3D Visual Grounding Via Referred Point Progressive Selection
(2022)
β’ 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 64 citations
Luo et al.
-
Practical Program Repair In The Era Of Large Pre-trained Language Models
(2022)
β’ 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE)
β’ 236 citations
Chunqiu Steven Xia, Yuxiang Wei, Lingming Zhang
-
DR-GAN: Distribution Regularization For Text-to-image Generation
(2022)
β’ IEEE Transactions on Neural Networks and Learning Systems
β’ 43 citations
Tan et al.
-
Mapping Global Dynamics Of Benchmark Creation And Saturation In Artificial Intelligence
(2022)
β’ Nature Communications
β’ 55 citations
Ott et al.
-
Dynatask: A Framework For Creating Dynamic AI Benchmark Tasks
(2022)
β’ Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations
β’ 40 citations
Thrush et al.
-
Document-level Relation Extraction With Adaptive Focal Loss And Knowledge Distillation
(2022)
β’ Findings of the Association for Computational Linguistics: ACL 2022
β’ 101 citations
Tan et al.
-
Transformer-based Language Models For Software Vulnerability Detection
(2022)
β’ ACSAC: Annual Computer Security Applications Conference
β’ 87 citations
Thapa et al.
-
Evaluating Mixed-initiative Conversational Search Systems Via User Simulation
(2022)
β’ Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining
β’ 42 citations
Ivan Sekulić, Mohammad Aliannejadi, Fabio Crestani
-
How To Keep Text Private? A Systematic Review Of Deep Learning Methods For Privacy-preserving Natural Language Processing
(2022)
β’ Artificial Intelligence Review
β’ 57 citations
Samuel Sousa, Roman Kern
-
Can Large Language Models Reason About Medical Questions?
(2022)
β’ Patterns
β’ 138 citations
Liévin et al.
-
Unified Structure Generation For Universal Information Extraction
(2022)
β’ Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
β’ 345 citations
Lu et al.
-
Bridging Video-text Retrieval With Multiple Choice Questions
(2022)
β’ 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 117 citations
Ge et al.
-
LAION-5B: An Open Large-scale Dataset For Training Next Generation Image-text Models
(2022)
β’ Arxiv
β’ 1032 citations
Schuhmann et al.
-
Zero-shot Text Classification With Self-training
(2022)
β’ Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
β’ 44 citations
Gera et al.
-
Leveraging Unimodal Self-supervised Learning For Multimodal Audio-visual Speech Recognition
(2022)
β’ Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
β’ 42 citations
Pan et al.
-
Why Does Surprisal From Larger Transformer-based Language Models Provide A Poorer Fit To Human Reading Times?
(2022)
β’ Transactions of the Association for Computational Linguistics
β’ 51 citations
Byung-Doh Oh, William Schuler
-
A-OKVQA: A Benchmark For Visual Question Answering Using World Knowledge
(2022)
β’ Lecture Notes in Computer Science
β’ 162 citations
Schwenk et al.
-
ASQA: Factoid Questions Meet Long-form Answers
(2022)
β’ Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
β’ 43 citations
Stelmakh et al.
-
Learn To Explain: Multimodal Reasoning Via Thought Chains For Science Question Answering
(2022)
β’ Arxiv
β’ 214 citations
Lu et al.
-
A Sequence-to-sequence Approach For Document-level Relation Extraction
(2022)
β’ Proceedings of the 21st Workshop on Biomedical Language Processing
β’ 50 citations
John Giorgi, Gary D. Bader, Bo Wang
-
Dynamic Prompt Learning Via Policy Gradient For Semi-structured Mathematical Reasoning
(2022)
β’ Arxiv
β’ 41 citations
Lu et al.
-
X-pool: Cross-modal Language-video Attention For Text-video Retrieval
(2022)
β’ 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 170 citations
Gorti et al.
-
News Summarization And Evaluation In The Era Of GPT-3
(2022)
β’ Arxiv
β’ 180 citations
Tanya Goyal, Junyi Jessy Li, Greg Durrett
-
Unixcoder: Unified Cross-modal Pre-training For Code Representation
(2022)
β’ Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
β’ 448 citations
Guo et al.
-
How Would Stance Detection Techniques Evolve After The Launch Of Chatgpt?
(2022)
β’ Arxiv
β’ 66 citations
Zhang et al.
-
Self-critiquing Models For Assisting Human Evaluators
(2022)
β’ Arxiv
β’ 46 citations
Saunders et al.
-
Speciesist Bias In AI -- How AI Applications Perpetuate Discrimination And Unfair Outcomes Against Animals
(2022)
β’ AI and Ethics
β’ 62 citations
Hagendorff et al.
-
Mucgec: A Multi-reference Multi-source Evaluation Dataset For Chinese Grammatical Error Correction
(2022)
β’ Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
β’ 51 citations
Zhang et al.
-
Temporal Alignment Networks For Long-term Video
(2022)
β’ 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 53 citations
Tengda Han, Weidi Xie, Andrew Zisserman
-
Twhin-bert: A Socially-enriched Pre-trained Language Model For Multilingual Tweet Representations At Twitter
(2022)
β’ KDD '23: The 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
β’ 43 citations
Zhang et al.
-
Fengshenbang 1.0: Being The Foundation Of Chinese Cognitive Intelligence
(2022)
β’ Arxiv
β’ 44 citations
Zhang et al.
-
Can Machines Help Us Answering Question 16 In Datasheets, And In Turn Reflecting On Inappropriate Content?
(2022)
β’ FAccT '22: 2022 ACM Conference on Fairness, Accountability, and Transparency
β’ 41 citations
Patrick Schramowski, Christopher Tauchmann, Kristian Kersting
-
WANLI: Worker And AI Collaboration For Natural Language Inference Dataset Creation
(2022)
β’ Findings of the Association for Computational Linguistics: EMNLP 2022
β’ 102 citations
Liu et al.
-
Vitaev2: Vision Transformer Advanced By Exploring Inductive Bias For Image Recognition And Beyond
(2022)
β’ International Journal of Computer Vision
β’ 173 citations
Zhang et al.
-
Pile Of Law: Learning Responsible Data Filtering From The Law And A 256GB Open-source Legal Dataset
(2022)
β’ Arxiv
β’ 43 citations
Henderson et al.
-
Reclip: A Strong Zero-shot Baseline For Referring Expression Comprehension
(2022)
β’ Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
β’ 94 citations
Subramanian et al.
-
Bridging The Gap Between Learning In Discrete And Continuous Environments For Vision-and-language Navigation
(2022)
β’ 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 53 citations
Hong et al.
-
From Discrimination To Generation: Knowledge Graph Completion With Generative Transformer
(2022)
β’ WWW '22: The ACM Web Conference 2022
β’ 63 citations
Xie et al.
-
Unnatural Instructions: Tuning Language Models With (almost) No Human Labor
(2022)
β’ Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
β’ 44 citations
Honovich et al.
-
TRUE: Re-evaluating Factual Consistency Evaluation
(2022)
β’ Proceedings of the Second DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering
β’ 47 citations
Honovich et al.
-
Graphmae: Self-supervised Masked Graph Autoencoders
(2022)
β’ Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
β’ 441 citations
Hou et al.
-
Context-aware Biaffine Localizing Network For Temporal Sentence Grounding
(2021)
β’ 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 128 citations
Liu et al.
-
Locate Then Segment: A Strong Pipeline For Referring Image Segmentation
(2021)
β’ 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 125 citations
Jing et al.
-
Augmenting Sequential Recommendation With Pseudo-prior Items Via Reversely Pre-training Transformer
(2021)
β’ SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval
β’ 116 citations
Liu et al.
-
Unit: Multimodal Multitask Learning With A Unified Transformer
(2021)
β’ 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
β’ 224 citations
Ronghang Hu, Amanpreet Singh
-
Signbert: Pre-training Of Hand-model-aware Representation For Sign Language Recognition
(2021)
β’ 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
β’ 89 citations
Hu et al.
-
Visually Grounded Reasoning Across Languages And Cultures
(2021)
β’ Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
β’ 84 citations
Liu et al.
-
Consert: A Contrastive Framework For Self-supervised Sentence Representation Transfer
(2021)
β’ Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
β’ 466 citations
Yan et al.
-
Reformer: The Relational Transformer For Image Captioning
(2021)
β’ MM '22: The 30th ACM International Conference on Multimedia
β’ 45 citations
Xuewen Yang, Yingru Liu, Xin Wang
-
Discriminative Triad Matching And Reconstruction For Weakly Referring Expression Grounding
(2021)
β’ IEEE Transactions on Pattern Analysis and Machine Intelligence
β’ 56 citations
Sun et al.
-
Efficient Attentions For Long Document Summarization
(2021)
β’ Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
β’ 121 citations
Huang et al.
-
Whiteningbert: An Easy Unsupervised Sentence Embedding Approach
(2021)
β’ Findings of the Association for Computational Linguistics: EMNLP 2021
β’ 70 citations
Huang et al.
-
Look Before You Leap: Learning Landmark Features For One-stage Visual Grounding
(2021)
β’ 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 111 citations
Huang et al.
-
Graph-enhanced Multi-task Learning Of Multi-level Transition Dynamics For Session-based Recommendation
(2021)
β’ Proceedings of the AAAI Conference on Artificial Intelligence
β’ 95 citations
Huang et al.
-
Task-adaptive Neural Process For User Cold-start Recommendation
(2021)
β’ Proceedings of the Web Conference 2021
β’ 81 citations
Lin et al.
-
On The Evaluation Of Neural Code Summarization
(2021)
β’ Proceedings of the 44th International Conference on Software Engineering
β’ 64 citations
Shi et al.
-
Are NLP Models Really Able To Solve Simple Math Word Problems?
(2021)
β’ Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
β’ 66 citations
Arkil Patel, Satwik Bhattamishra, Navin Goyal
-
Psyqa: A Chinese Dataset For Generating Long Counseling Text For Mental Health Support
(2021)
β’ Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
β’ 47 citations
Sun et al.
-
Detecting Harmful Memes And Their Targets
(2021)
β’ Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
β’ 90 citations
Pramanick et al.
-
Swinbert: End-to-end Transformers With Sparse Attention For Video Captioning
(2021)
β’ 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 208 citations
Lin et al.
-
Memory Augmented Multi-instance Contrastive Predictive Coding For Sequential Recommendation
(2021)
β’ 2021 IEEE International Conference on Data Mining (ICDM)
β’ 45 citations
Ruihong Qiu, Zi Huang, Hongzhi Yin
-
TABBIE: Pretrained Representations Of Tabular Data
(2021)
β’ Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
β’ 110 citations
Iida et al.
-
TAPEX: Table Pre-training Via Learning A Neural SQL Executor
(2021)
β’ Arxiv
β’ 90 citations
Liu et al.
-
MOMENTA: A Multimodal Framework For Detecting Harmful Memes And Their Targets
(2021)
β’ Findings of the Association for Computational Linguistics: EMNLP 2021
β’ 100 citations
Pramanick et al.
-
Semantic Answer Similarity For Evaluating Question Answering Models
(2021)
β’ Proceedings of the 3rd Workshop on Machine Reading for Question Answering
β’ 43 citations
Risch et al.
-
Document-level Event Argument Extraction By Conditional Generation
(2021)
β’ Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
β’ 222 citations
Sha Li, Heng Ji, Jiawei Han
-
Towards Enhancing Fine-grained Details For Image Matting
(2021)
β’ Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
β’ 160 citations
Chang Liu, Henghui Ding, Xudong Jiang
-
Scaling Up Visual And Vision-language Representation Learning With Noisy Text Supervision
(2021)
β’ International Conference on Machine Learning 2021
β’ 1191 citations
Jia et al.
-
Multimodal Emergent Fake News Detection Via Meta Neural Process Networks
(2021)
β’ Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining
β’ 51 citations
Wang et al.
-
Pre-training BERT On Arabic Tweets: Practical Considerations
(2021)
β’ Arxiv
β’ 83 citations
Abdelali et al.
-
Empowering News Recommendation With Pre-trained Language Models
(2021)
β’ SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval
β’ 122 citations
Wu et al.
-
Natural Language Understanding For Argumentative Dialogue Systems In The Opinion Building Domain
(2021)
β’ Knowledge-Based Systems
β’ 41 citations
Abro et al.
-
Pairwise Supervised Contrastive Learning Of Sentence Representations
(2021)
β’ Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
β’ 41 citations
Zhang et al.
-
Muppet: Massive Multi-task Representations With Pre-finetuning
(2021)
β’ Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
β’ 168 citations
Aghajanyan et al.
-
Arat5: Text-to-text Transformers For Arabic Language Generation
(2021)
β’ Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
β’ 56 citations
El Moatez Billah Nagoudi, Abdelrahim Elmadany, Muhammad Abdul-Mageed
-
Building And Evaluating Open-domain Dialogue Corpora With Clarifying Questions
(2021)
β’ Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
β’ 57 citations
Aliannejadi et al.
-
GODIVA: Generating Open-domain Videos From Natural Descriptions
(2021)
β’ Arxiv
β’ 78 citations
Wu et al.
-
PASTE: A Tagging-free Decoding Framework Using Pointer Networks For Aspect Sentiment Triplet Extraction
(2021)
β’ Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
β’ 49 citations
Mukherjee et al.
-
Esimcse: Enhanced Sample Building Method For Contrastive Learning Of Unsupervised Sentence Embedding
(2021)
β’ Arxiv
β’ 69 citations
Wu et al.
-
Docformer: End-to-end Transformer For Document Understanding
(2021)
β’ 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
β’ 200 citations
Appalaraju et al.
-
Datasets: A Community Library For Natural Language Processing
(2021)
β’ Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
β’ 310 citations
Lhoest et al.
-
Geometry Attention Transformer With Position-aware Lstms For Image Captioning
(2021)
β’ Expert Systems with Applications
β’ 59 citations
Chi Wang, Yulin Shen, Luping Ji
-
Layoutreader: Pre-training Of Text And Layout For Reading Order Detection
(2021)
β’ Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
β’ 46 citations
Wang et al.
-
Combat COVID-19 Infodemic Using Explainable Natural Language Processing Models
(2021)
β’ Information Processing & Management
β’ 146 citations
Jackie Ayoub, X. Jessie Yang, Feng Zhou
-
Describing And Localizing Multiple Changes With Transformers
(2021)
β’ 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
β’ 76 citations
Qiu et al.
-
Just Say No: Analyzing The Stance Of Neural Dialogue Generation In Offensive Contexts
(2021)
β’ Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
β’ 48 citations
Baheti et al.
-
Cross-lingual Abstractive Summarization With Limited Parallel Resources
(2021)
β’ Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
β’ 43 citations
Yu Bai, Yang Gao, Heyan Huang
-
Simple And Effective Zero-shot Cross-lingual Phoneme Recognition
(2021)
β’ Lecture Notes in Computer Science
β’ 155 citations
Qiantong Xu, Alexei Baevski, Michael Auli
-
Learning Transferable Visual Models From Natural Language Supervision
(2021)
β’ Arxiv
β’ 5297 citations
Radford et al.
-
Towards More Flexible And Accurate Object Tracking With Natural Language: Algorithms And Benchmark
(2021)
β’ 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 198 citations
Wang et al.
-
TSDAE: Using Transformer-based Sequential Denoising Auto-encoder For Unsupervised Sentence Embedding Learning
(2021)
β’ Findings of the Association for Computational Linguistics: EMNLP 2021
β’ 76 citations
Kexin Wang, Nils Reimers, Iryna Gurevych
-
Screen2words: Automatic Mobile UI Summarization With Multimodal Learning
(2021)
β’ The 34th Annual ACM Symposium on User Interface Software and Technology
β’ 78 citations
Wang et al.
-
Data Augmentation In Natural Language Processing: A Novel Text Generation Approach For Long And Short Text Classifiers
(2021)
β’ International Journal of Machine Learning and Cybernetics
β’ 127 citations
Bayer et al.
-
Multi-modal Sarcasm Detection And Humor Classification In Code-mixed Conversations
(2021)
β’ IEEE Transactions on Affective Computing
β’ 63 citations
Bedi et al.
-
Data Expansion Using Back Translation And Paraphrasing For Hate Speech Detection
(2021)
β’ Arxiv
β’ 73 citations
Djamila Romaissa Beddiar, Md Saroar Jahan, Mourad Oussalah
-
Few-shot Domain Adaptation For Grammatical Error Correction Via Meta-learning
(2021)
β’ Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
β’ 64 citations
Zhang et al.
-
Banglabert: Language Model Pretraining And Benchmarks For Low-resource Language Understanding Evaluation In Bangla
(2021)
β’ Findings of the Association for Computational Linguistics: NAACL 2022
β’ 71 citations
Bhattacharjee et al.
-
Cycle-consistent Inverse GAN For Text-to-image Synthesis
(2021)
β’ Proceedings of the 29th ACM International Conference on Multimedia
β’ 44 citations
Wang et al.
-
Curriculum Pre-training Heterogeneous Subgraph Transformer For Top-n Recommendation
(2021)
β’ ACM Transactions on Information Systems
β’ 45 citations
Wang et al.
-
CLEVE: Contrastive Pre-training For Event Extraction
(2021)
β’ Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
β’ 100 citations
Wang et al.
-
Crossclr: Cross-modal Contrastive Learning For Multi-modal Video Representations
(2021)
β’ 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
β’ 121 citations
Zolfaghari et al.
-
Everything At Once -- Multi-modal Fusion Transformer For Video Retrieval
(2021)
β’ 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 112 citations
Shvetsova et al.
-
Quiz-style Question Generation For News Stories
(2021)
β’ Proceedings of the Web Conference 2021
β’ 41 citations
Adam D. Lelkes, Vinh Q. Tran, Cong Yu
-
Crslab: An Open-source Toolkit For Building Conversational Recommender System
(2021)
β’ Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations
β’ 44 citations
Zhou et al.
-
Recursively Summarizing Books With Human Feedback
(2021)
β’ Arxiv
β’ 65 citations
Wu et al.
-
End-to-end Referring Video Object Segmentation With Multimodal Transformers
(2021)
β’ 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 129 citations
Adam Botach, Evgenii Zheltonozhskii, Chaim Baskin
-
Metaicl: Learning To Learn In Context
(2021)
β’ Arxiv
β’ 61 citations
Min et al.
-
ASAP: A Chinese Review Dataset Towards Aspect Category Sentiment Analysis And Rating Prediction
(2021)
β’ Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
β’ 63 citations
Bu et al.
-
Indonlg: Benchmark And Resources For Evaluating Indonesian Natural Language Generation
(2021)
β’ Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
β’ 68 citations
Cahyawijaya et al.
-
Out-of-scope Intent Detection With Self-supervision And Discriminative Training
(2021)
β’ Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
β’ 46 citations
Zhan et al.
-
Disentangling Hate In Online Memes
(2021)
β’ Proceedings of the 29th ACM International Conference on Multimedia
β’ 73 citations
Cao et al.
-
Deduplicating Training Data Makes Language Models Better
(2021)
β’ Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
β’ 119 citations
Lee et al.
-
Multieurlex -- A Multi-lingual And Multi-label Legal Document Classification Dataset For Zero-shot Cross-lingual Transfer
(2021)
β’ Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
β’ 68 citations
Ilias Chalkidis, Manos Fergadiotis, Ion Androutsopoulos
-
Lexglue: A Benchmark Dataset For Legal Language Understanding In English
(2021)
β’ SSRN Electronic Journal
β’ 73 citations
Chalkidis et al.
-
Speechstew: Simply Mix All Available Speech Recognition Data To Train One Large Neural Network
(2021)
β’ Arxiv
β’ 75 citations
Chan et al.
-
Dialogsum: A Real-life Scenario Dialogue Summarization Dataset
(2021)
β’ Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
β’ 121 citations
Chen et al.
-
Bidirectional Machine Reading Comprehension For Aspect Sentiment Triplet Extraction
(2021)
β’ Proceedings of the AAAI Conference on Artificial Intelligence
β’ 185 citations
Chen et al.
-
Knowprompt: Knowledge-aware Prompt-tuning With Synergistic Optimization For Relation Extraction
(2021)
β’ Proceedings of the ACM Web Conference 2022
β’ 330 citations
Chen et al.
-
Semantic And Syntactic Enhanced Aspect Sentiment Triplet Extraction
(2021)
β’ Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
β’ 60 citations
Chen et al.
-
Graph Based Network With Contextualized Representations Of Turns In Dialogue
(2021)
β’ Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
β’ 40 citations
Bongseok Lee, Yong Suk Choi
-
Studying The Usage Of Text-to-text Transfer Transformer To Support Code-related Tasks
(2021)
β’ 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE)
β’ 190 citations
Mastropaolo et al.
-
Deepcad: A Deep Generative Network For Computer-aided Design Models
(2021)
β’ 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
β’ 112 citations
Rundi Wu, Chang Xiao, Changxi Zheng
-
Swiss-judgment-prediction: A Multilingual Legal Judgment Prediction Benchmark
(2021)
β’ Proceedings of the Natural Legal Language Processing Workshop 2021
β’ 46 citations
Joel Niklaus, Ilias Chalkidis, Matthias Stürmer
-
Instancerefer: Cooperative Holistic Understanding For Visual Grounding On Point Clouds Through Instance Multi-level Contextual Referring
(2021)
β’ 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
β’ 99 citations
Yuan et al.
-
Bartscore: Evaluating Generated Text As Text Generation
(2021)
β’ Arxiv
β’ 318 citations
Weizhe Yuan, Graham Neubig, Pengfei Liu
-
Evaluation Of BERT And ALBERT Sentence Embedding Performance On Downstream NLP Tasks
(2021)
β’ 2020 25th International Conference on Pattern Recognition (ICPR)
β’ 115 citations
Choi et al.
-
Planning With Learned Entity Prompts For Abstractive Summarization
(2021)
β’ Transactions of the Association for Computational Linguistics
β’ 92 citations
Narayan et al.
-
BERT-GT: Cross-sentence N-ary Relation Extraction With BERT And Graph Transformer
(2021)
β’ Bioinformatics
β’ 49 citations
Po-Ting Lai, Zhiyong Lu
-
Layoutxlm: Multimodal Pre-training For Multilingual Visually-rich Document Understanding
(2021)
β’ Arxiv
β’ 48 citations
Xu et al.
-
Learning Modality-specific Representations With Self-supervised Multi-task Learning For Multimodal Sentiment Analysis
(2021)
β’ Proceedings of the AAAI Conference on Artificial Intelligence
β’ 555 citations
Yu et al.
-
TEACHTEXT: Crossmodal Generalized Distillation For Text-video Retrieval
(2021)
β’ 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
β’ 117 citations
Croitoru et al.
-
Multimodal End-to-end Sparse Model For Emotion Recognition
(2021)
β’ Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
β’ 70 citations
Dai et al.
-
Masked Language Modeling And The Distributional Hypothesis: Order Word Matters Pre-training For Little
(2021)
β’ Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
β’ 62 citations
Sinha et al.
-
Deepxml: A Deep Extreme Multi-label Learning Framework Applied To Short Text Documents
(2021)
β’ Proceedings of the 14th ACM International Conference on Web Search and Data Mining
β’ 60 citations
Dahiya et al.
-
Does Syntax Matter? A Strong Baseline For Aspect-based Sentiment Analysis With Roberta
(2021)
β’ Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
β’ 155 citations
Dai et al.
-
Beyond Goldfish Memory: Long-term Open-domain Conversation
(2021)
β’ Arxiv
β’ 40 citations
Jing Xu, Arthur Szlam, Jason Weston
-
Increasing Faithfulness In Knowledge-grounded Dialogue With Controllable Features
(2021)
β’ Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
β’ 69 citations
Rashkin et al.
-
Case-based Reasoning For Natural Language Queries Over Knowledge Bases
(2021)
β’ Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
β’ 99 citations
Das et al.
-
A Dataset Of Information-seeking Questions And Answers Anchored In Research Papers
(2021)
β’ Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
β’ 86 citations
Dasigi et al.
-
Quality At A Glance: An Audit Of Web-crawled Multilingual Datasets
(2021)
β’ Transactions of the Association for Computational Linguistics
β’ 155 citations
Kreutzer et al.
-
Entity Structure Within And Throughout: Modeling Mention Dependencies For Document-level Relation Extraction
(2021)
β’ Proceedings of the AAAI Conference on Artificial Intelligence
β’ 167 citations
Xu et al.
-
Docnli: A Large-scale Dataset For Document-level Natural Language Inference
(2021)
β’ Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
β’ 51 citations
Wenpeng Yin, Dragomir Radev, Caiming Xiong
-
BOLD: Dataset And Metrics For Measuring Biases In Open-ended Language Generation
(2021)
β’ FAccT '21: 2021 ACM Conference on Fairness, Accountability, and Transparency
β’ 125 citations
Dhamala et al.
-
Time-aware Language Models As Temporal Knowledge Bases
(2021)
β’ Transactions of the Association for Computational Linguistics
β’ 49 citations
Dhingra et al.
-
Similarity Reasoning And Filtration For Image-text Matching
(2021)
β’ Proceedings of the AAAI Conference on Artificial Intelligence
β’ 319 citations
Diao et al.
-
Few-nerd: A Few-shot Named Entity Recognition Dataset
(2021)
β’ Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
β’ 184 citations
Ding et al.
-
Transferable Dialogue Systems And User Simulators
(2021)
β’ Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
β’ 40 citations
Tseng et al.
-
SIMMC 2.0: A Task-oriented Dialog Dataset For Immersive Multimodal Conversations
(2021)
β’ Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
β’ 51 citations
Kottur et al.
-
Cross-lingual COVID-19 Fake News Detection
(2021)
β’ 2021 International Conference on Data Mining Workshops (ICDMW)
β’ 40 citations
Du et al.
-
Plan-then-generate: Controlled Data-to-text Generation Via Planning
(2021)
β’ Findings of the Association for Computational Linguistics: EMNLP 2021
β’ 57 citations
Su et al.
-
Clip4caption ++: Multi-clip For Video Caption
(2021)
β’ Proceedings of the 29th ACM International Conference on Multimedia
β’ 113 citations
Tang et al.
-
Decoupling The Role Of Data, Attention, And Losses In Multimodal Transformers
(2021)
β’ Transactions of the Association for Computational Linguistics
β’ 63 citations
Hendricks et al.
-
CUAD: An Expert-annotated NLP Dataset For Legal Contract Review
(2021)
β’ Arxiv
β’ 95 citations
Hendrycks et al.
-
Measuring Mathematical Problem Solving With The MATH Dataset
(2021)
β’ Arxiv
β’ 268 citations
Hendrycks et al.
-
Object-region Video Transformers
(2021)
β’ 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
β’ 62 citations
Herzig et al.
-
Unlocking Compositional Generalization In Pre-trained Models Using Intermediate Representations
(2021)
β’ Arxiv
β’ 51 citations
Herzig et al.
-
MDETR -- Modulated Detection For End-to-end Multi-modal Understanding
(2021)
β’ 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
β’ 594 citations
Kamath et al.
-
Exploiting BERT For Multimodal Target Sentiment Classification Through Input Space Translation
(2021)
• Proceedings of the 29th ACM International Conference on Multimedia
• 153 citations
Zaid Khan, Yun Fu
-
Matscibert: A Materials Domain Language Model For Text Mining And Information Extraction
(2021)
• npj Computational Materials
• 200 citations
Gupta et al.
-
Jointgt: Graph-text Joint Representation Learning For Text Generation From Knowledge Graphs
(2021)
• Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
• 77 citations
Ke et al.
-
CAVER: Cross-modal View-mixed Transformer For Bi-modal Salient Object Detection
(2021)
• IEEE Transactions on Image Processing
• 138 citations
Pang et al.
-
Audioclip: Extending CLIP To Image, Text And Audio
(2021)
• ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
• 210 citations
Guzhov et al.
-
Asvspoof 2021: Accelerating Progress In Spoofed And Deepfake Speech Detection
(2021)
• 2021 Edition of the Automatic Speaker Verification and Spoofing Countermeasures Challenge
• 268 citations
Yamagishi et al.
-
Vision Transformers For Weeds And Crops Classification Of High Resolution UAV Images
(2021)
• Remote Sensing
• 167 citations
Reedha et al.
-
Meta-learning Adversarial Domain Adaptation Network For Few-shot Text Classification
(2021)
• Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
• 54 citations
Han et al.
-
Exploring Task Difficulty For Few-shot Relation Extraction
(2021)
• Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
• 71 citations
Jiale Han, Bo Cheng, Wei Lu
-
Knowledge-enhanced Hierarchical Graph Transformer Network For Multi-behavior Recommendation
(2021)
• Proceedings of the AAAI Conference on Artificial Intelligence
• 179 citations
Xia et al.
-
Multiplex Behavioral Relation Learning For Recommendation Via Memory Augmented Transformer Network
(2021)
• SIGIR '20: The 43rd International ACM SIGIR conference on research and development in Information Retrieval
• 129 citations
Xia et al.
-
E-vil: A Dataset And Benchmark For Natural Language Explanations In Vision-language Tasks
(2021)
• 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
• 54 citations
Kayser et al.
-
KLUE: Korean Language Understanding Evaluation
(2021)
• Arxiv
• 78 citations
Park et al.
-
Open-book Video Captioning With Retrieve-copy-generate Network
(2021)
• 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• 100 citations
Zhang et al.
-
A Unified Generative Framework For Aspect-based Sentiment Analysis
(2021)
• Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
• 256 citations
Yan et al.
-
Xl-sum: Large-scale Multilingual Abstractive Summarization For 44 Languages
(2021)
• Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
• 179 citations
Hasan et al.
-
Roformer: Enhanced Transformer With Rotary Position Embedding
(2021)
• Neurocomputing
• 830 citations
Su et al.
-
Transrefer3d: Entity-and-relation Aware Transformer For Fine-grained 3D Visual Grounding
(2021)
• MM '21: ACM Multimedia Conference
• 65 citations
He et al.
-
GALAXY: A Generative Pre-trained Model For Task-oriented Dialog With Semi-supervised Learning And Explicit Policy Injection
(2021)
• Arxiv
• 45 citations
He et al.
-
Multitask Prompted Training Enables Zero-shot Task Generalization
(2021)
• Arxiv
• 558 citations
Sanh et al.
-
Does CLIP Benefit Visual Question Answering In The Medical Domain As Much As It Does In The General Domain?
(2021)
• Arxiv
• 41 citations
Sedigheh Eslami, Gerard de Melo, Christoph Meinel
-
Clip2video: Mastering Video-text Retrieval Via Image CLIP
(2021)
• Arxiv
• 130 citations
Fang et al.
-
SATAR: A Self-supervised Approach To Twitter Account Representation Learning And Its Application In Bot Detection
(2021)
• CIKM '21: The 30th ACM International Conference on Information and Knowledge Management
• 60 citations
Feng et al.
-
An Improved Baseline For Sentence-level Relation Extraction
(2021)
• Arxiv
• 49 citations
Wenxuan Zhou, Muhao Chen
-
Structext: Structured Text Understanding With Multi-modal Transformers
(2021)
• MM '21: ACM Multimedia Conference
• 89 citations
Li et al.
-
Cross-modal Contrastive Learning For Text-to-image Generation
(2021)
• 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• 297 citations
Zhang et al.
-
Guided Generation Of Cause And Effect
(2021)
• Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
• 47 citations
Li et al.
-
Trocr: Transformer-based Optical Character Recognition With Pre-trained Models
(2021)
• Proceedings of the AAAI Conference on Artificial Intelligence
• 247 citations
Li et al.
-
Supervision Exists Everywhere: A Data Efficient Contrastive Language-image Pre-training Paradigm
(2021)
• Arxiv
• 126 citations
Li et al.
-
Aspect Sentiment Quad Prediction As Paraphrase Generation
(2021)
• Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
• 174 citations
Zhang et al.
-
Keeping It Simple: Language Models Can Learn Complex Molecular Distributions
(2021)
• Nature Communications
• 129 citations
Daniel Flam-Shepherd, Kevin Zhu, Alán Aspuru-Guzik
-
Adversarial Text-to-image Synthesis: A Review
(2021)
• Neural Networks
• 193 citations
Frolov et al.
-
Attend What You Need: Motion-appearance Synergistic Networks For Video Question Answering
(2021)
• Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
• 57 citations
Seo et al.
-
Bigssl: Exploring The Frontier Of Large-scale Semi-supervised Learning For Automatic Speech Recognition
(2021)
• IEEE Journal of Selected Topics in Signal Processing
• 148 citations
Zhang et al.
-
Towards Robustness Of Text-to-sql Models Against Synonym Substitution
(2021)
• Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
• 58 citations
Gan et al.
-
Open Aspect Target Sentiment Classification With Natural Language Prompts
(2021)
• Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
• 43 citations
Seoh et al.
-
Clip4clip: An Empirical Study Of CLIP For End To End Video Clip Retrieval
(2021)
• Arxiv
• 113 citations
Luo et al.
-
Newsclippings: Automatic Generation Of Out-of-context Multimodal Media
(2021)
• Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
• 46 citations
Grace Luo, Trevor Darrell, Anna Rohrbach
-
Simcse: Simple Contrastive Learning Of Sentence Embeddings
(2021)
• Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
• 2273 citations
Tianyu Gao, Xingcheng Yao, Danqi Chen
-
Unsupervised Corpus Aware Language Model Pre-training For Dense Passage Retrieval
(2021)
• Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
• 88 citations
Luyu Gao, Jamie Callan
-
Topic-driven And Knowledge-aware Transformer For Dialogue Emotion Detection
(2021)
• Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
• 94 citations
Zhu et al.
-
Competency Problems: On Finding And Removing Artifacts In Language Data
(2021)
• Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
• 68 citations
Gardner et al.
-
Crossfit: A Few-shot Learning Challenge For Cross-task Generalization In NLP
(2021)
• Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
• 100 citations
Qinyuan Ye, Bill Yuchen Lin, Xiang Ren
-
End-to-end Speech Translation Via Cross-modal Progressive Training
(2021)
• Interspeech 2021
• 42 citations
Rong Ye, Mingxuan Wang, Lei Li
-
INVIGORATE: Interactive Visual Grounding And Grasping In Clutter
(2021)
• Robotics: Science and Systems XVII
• 45 citations
Zhang et al.
-
LAION-400M: Open Dataset Of Clip-filtered 400 Million Image-text Pairs
(2021)
• Arxiv
• 366 citations
Schuhmann et al.
-
Synthesis Of Compositional Animations From Textual Descriptions
(2021)
• 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
• 140 citations
Ghosh et al.
-
Image Retrieval On Real-life Images With Pre-trained Vision-and-language Models
(2021)
• 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
• 151 citations
Liu et al.
-
TIMEDIAL: Temporal Commonsense Reasoning In Dialog
(2021)
• Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
• 40 citations
Qin et al.
-
Generating Datasets With Pretrained Language Models
(2021)
• Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
• 40 citations
Timo Schick, Hinrich Schütze
-
Visualmrc: Machine Reading Comprehension On Document Images
(2021)
• Proceedings of the AAAI Conference on Artificial Intelligence
• 62 citations
Ryota Tanaka, Kyosuke Nishida, Sen Yoshida
-
Hate Towards The Political Opponent: A Twitter Corpus Study Of The 2020 US Elections On The Basis Of Offensive Speech And Stance Detection
(2021)
• Arxiv
• 41 citations
Lara Grimminger, Roman Klinger
-
The Multimodal Sentiment Analysis In Car Reviews (muse-car) Dataset: Collection, Insights And Improvements
(2021)
• IEEE Transactions on Affective Computing
• 57 citations
Stappen et al.
-
Open-vocabulary Object Detection Via Vision And Language Knowledge Distillation
(2021)
• ICLR 2022
• 280 citations
Gu et al.
-
Airbert: In-domain Pretraining For Vision-and-language Navigation
(2021)
• 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
• 118 citations
Guhur et al.
-
BENDR: Using Transformers And A Contrastive Self-supervised Learning Task To Learn From Massive Amounts Of EEG Data
(2021)
• Frontiers in Human Neuroscience
• 172 citations
Demetres Kostas, Stephane Aroca-Ouellette, Frank Rudzicz
-
End-to-end Audio-visual Speech Recognition With Conformers
(2021)
• ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
• 202 citations
Pingchuan Ma, Stavros Petridis, Maja Pantic
-
Transnas-bench-101: Improving Transferability And Generalizability Of Cross-task Neural Architecture Search
(2021)
• 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• 41 citations
Duan et al.
-
Indobertweet: A Pretrained Language Model For Indonesian Twitter With Effective Domain-specific Vocabulary Initialization
(2021)
• Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
• 45 citations
Fajri Koto, Jey Han Lau, Timothy Baldwin
-
Sub-instruction Aware Vision-and-language Navigation
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 53 citations
Hong et al.
-
Fastbert: A Self-distilling BERT With Adaptive Inference Time
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 245 citations
Liu et al.
-
Sequence-level Mixed Sample Data Augmentation
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 68 citations
Demi Guo, Yoon Kim, Alexander M. Rush
-
XGLUE: A New Benchmark Dataset For Cross-lingual Pre-training, Understanding And Generation
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 58 citations
Liang et al.
-
Low Rank Fusion Based Transformers For Multimodal Sequences
(2020)
• Second Grand-Challenge and Workshop on Multimodal Language (Challenge-HML)
• 53 citations
Sahay et al.
-
BOND: Bert-assisted Open-domain Named Entity Recognition With Distant Supervision
(2020)
• Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
• 113 citations
Liang et al.
-
LAMBERT: Layout-aware (language) Modeling For Information Extraction
(2020)
• Lecture Notes in Computer Science
• 84 citations
Garncarek et al.
-
Babywalk: Going Farther In Vision-and-language Navigation By Taking Baby Steps
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 59 citations
Zhu et al.
-
GREEK-BERT: The Greeks Visiting Sesame Street
(2020)
• 11th Hellenic Conference on Artificial Intelligence
• 79 citations
Koutsikakis et al.
-
Coarse-to-fine Pre-training For Named Entity Recognition
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 49 citations
Xue et al.
-
A Comparison Of LSTM And BERT For Small Corpus
(2020)
• Arxiv
• 66 citations
Aysu Ezen-Can
-
Tweepfake: About Detecting Deepfake Tweets
(2020)
• PLOS ONE
• 153 citations
Fagni et al.
-
A Novel Graph-based Multi-modal Fusion Encoder For Neural Machine Translation
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 139 citations
Yin et al.
-
Unifiedqa: Crossing Format Boundaries With A Single QA System
(2020)
• Findings of the Association for Computational Linguistics: EMNLP 2020
• 51 citations
Khashabi et al.
-
Multi-dialect Arabic BERT For Country-level Dialect Identification
(2020)
• Arxiv
• 45 citations
Talafha et al.
-
Tabert: Pretraining For Joint Understanding Of Textual And Tabular Data
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 380 citations
Yin et al.
-
Meta-cotgan: A Meta Cooperative Training Paradigm For Improving Adversarial Text Generation
(2020)
• Arxiv
• 62 citations
Yin et al.
-
Parsbert: Transformer-based Model For Persian Language Understanding
(2020)
• Neural Processing Letters
• 111 citations
Farahani et al.
-
A Contextual Hierarchical Attention Network With Adaptive Objective For Dialogue State Tracking
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 57 citations
Shan et al.
-
Doc2dial: A Goal-oriented Document-grounded Dialogue Dataset
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 84 citations
Feng et al.
-
Aligning AI With Shared Human Values
(2020)
• Arxiv
• 100 citations
Hendrycks et al.
-
Parallel Data Augmentation For Formality Style Transfer
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 67 citations
Yi Zhang, Tao Ge, Xu Sun
-
Towards Automated Neural Interaction Discovery For Click-through Rate Prediction
(2020)
• Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
• 61 citations
Song et al.
-
Pretrained Transformers Improve Out-of-distribution Robustness
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 104 citations
Hendrycks et al.
-
Weakly-supervised Multi-level Attentional Reconstruction Network For Grounding Textual Queries In Videos
(2020)
• Arxiv
• 51 citations
Song et al.
-
Leakage-adjusted Simulatability: Can Models Generate Non-trivial Explanations Of Their Behavior In Natural Language?
(2020)
• Findings of the Association for Computational Linguistics: EMNLP 2020
• 55 citations
Hase et al.
-
Human-centric Spatio-temporal Video Grounding With Visual Transformers
(2020)
• IEEE Transactions on Circuits and Systems for Video Technology
• 75 citations
Tang et al.
-
Domain-specific Language Model Pretraining For Biomedical Natural Language Processing
(2020)
• ACM Transactions on Computing for Healthcare
• 915 citations
Gu et al.
-
Kdconv: A Chinese Multi-domain Dialogue Dataset Towards Multi-turn Knowledge-driven Conversation
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 101 citations
Zhou et al.
-
Injecting Numerical Reasoning Skills Into Language Models
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 53 citations
Mor Geva, Ankit Gupta, Jonathan Berant
-
Ctrlsum: Towards Generic Controllable Text Summarization
(2020)
• Arxiv
• 50 citations
He et al.
-
Contrastive Triple Extraction With Generative Transformer
(2020)
• Proceedings of the AAAI Conference on Artificial Intelligence
• 72 citations
Ye et al.
-
Machine Reading Comprehension: The Role Of Contextualized Language Models And Beyond
(2020)
• Arxiv
• 48 citations
Zhuosheng Zhang, Hai Zhao, Rui Wang
-
Transformer Networks For Trajectory Forecasting
(2020)
• 2020 25th International Conference on Pattern Recognition (ICPR)
• 336 citations
Giuliari et al.
-
Dynamic And Static Context-aware LSTM For Multi-agent Motion Prediction
(2020)
• Lecture Notes in Computer Science
• 45 citations
Tao et al.
-
INSPIRED: Toward Sociable Recommendation Dialog Systems
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 95 citations
Hayati et al.
-
A Large Dataset Of Historical Japanese Documents With Complex Layouts
(2020)
• 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
• 52 citations
Zejiang Shen, Kaixuan Zhang, Melissa Dell
-
Creating Something From Nothing: Unsupervised Knowledge Distillation For Cross-modal Hashing
(2020)
• 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• 127 citations
Hu et al.
-
Rethinking Generalization Of Neural Models: A Named Entity Recognition Case Study
(2020)
• Proceedings of the AAAI Conference on Artificial Intelligence
• 72 citations
Fu et al.
-
LUKE: Deep Contextualized Entity Representations With Entity-aware Self-attention
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 536 citations
Yamada et al.
-
Learning To Discretely Compose Reasoning Module Networks For Video Captioning
(2020)
• Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
• 59 citations
Tan et al.
-
Learning From Others' Mistakes: Avoiding Dataset Biases Without Modeling Them
(2020)
• Arxiv
• 51 citations
Sanh et al.
-
News Recommender System: A Review Of Recent Progress, Challenges, And Opportunities
(2020)
• Artificial Intelligence Review
• 138 citations
Shaina Raza, Chen Ding
-
Language-guided Navigation Via Cross-modal Grounding And Alternate Adversarial Learning
(2020)
• IEEE Transactions on Circuits and Systems for Video Technology
• 61 citations
Zhang et al.
-
Compositional Generalization In Semantic Parsing: Pre-training Vs. Specialized Architectures
(2020)
• Arxiv
• 74 citations
Furrer et al.
-
Where Does It Exist: Spatio-temporal Video Grounding For Multi-form Sentences
(2020)
• 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• 94 citations
Zhang et al.
-
POINTER: Constrained Progressive Text Generation Via Insertion-based Generative Pre-training
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 66 citations
Zhang et al.
-
CLUE: A Chinese Language Understanding Evaluation Benchmark
(2020)
• Proceedings of the 28th International Conference on Computational Linguistics
• 235 citations
Xu et al.
-
Totto: A Controlled Table-to-text Generation Dataset
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 69 citations
Parikh et al.
-
Self-training For End-to-end Speech Translation
(2020)
• Interspeech 2020
• 40 citations
Pino et al.
-
From Machine Reading Comprehension To Dialogue State Tracking: Bridging The Gap
(2020)
• Proceedings of the 2nd Workshop on Natural Language Processing for Conversational AI
• 49 citations
Gao et al.
-
Generating Question Titles For Stack Overflow From Mined Code Snippets
(2020)
• ACM Transactions on Software Engineering and Methodology
• 55 citations
Gao et al.
-
A Co-interactive Transformer For Joint Slot Filling And Intent Detection
(2020)
• ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
• 89 citations
Qin et al.
-
Semi-supervised Neural Architecture Search
(2020)
• Arxiv
• 43 citations
Luo et al.
-
Multi-task Collaborative Network For Joint Referring Expression Comprehension And Segmentation
(2020)
• 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• 268 citations
Luo et al.
-
A Dataset And Baselines For Visual Question Answering On Art
(2020)
• Lecture Notes in Computer Science
• 49 citations
Garcia et al.
-
Widget Captioning: Generating Natural Language Description For Mobile User Interface Elements
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 42 citations
Li et al.
-
Evaluating Models' Local Decision Boundaries Via Contrast Sets
(2020)
• Findings of the Association for Computational Linguistics: EMNLP 2020
• 252 citations
Gardner et al.
-
Generative Data Augmentation For Commonsense Reasoning
(2020)
• Findings of the Association for Computational Linguistics: EMNLP 2020
• 92 citations
Yang et al.
-
On The Potential Of Lexico-logical Alignments For Semantic Parsing To SQL Queries
(2020)
• Findings of the Association for Computational Linguistics: EMNLP 2020
• 41 citations
Shi et al.
-
Improving Massively Multilingual Neural Machine Translation And Zero-shot Translation
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 208 citations
Zhang et al.
-
Demographics Should Not Be The Reason Of Toxicity: Mitigating Discrimination In Text Classifications With Instance Weighting
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 55 citations
Zhang et al.
-
Reasoning With Latent Structure Refinement For Document-level Relation Extraction
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 280 citations
Nan et al.
-
OCNLI: Original Chinese Natural Language Inference
(2020)
• Findings of the Association for Computational Linguistics: EMNLP 2020
• 71 citations
Hu et al.
-
M3P: Learning Universal Representations Via Multitask Multilingual Multimodal Pre-training
(2020)
• Arxiv
• 43 citations
Ni et al.
-
ARBERT & MARBERT: Deep Bidirectional Transformers For Arabic
(2020)
• Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
• 217 citations
Muhammad Abdul-Mageed, Abdelrahim Elmadany, El Moatez Billah Nagoudi
-
Improving Coreference Resolution By Leveraging Entity-centric Features With Graph Neural Networks And Second-order Inference
(2020)
• Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
• 112 citations
Liu et al.
-
Efficient Second-order Treecrf For Neural Dependency Parsing
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 88 citations
Yu Zhang, Zhenghua Li, Min Zhang
-
Neural CRF Model For Sentence Alignment In Text Simplification
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 97 citations
Jiang et al.
-
Diagnosing The Environment Bias In Vision-and-language Navigation
(2020)
• Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
• 44 citations
Yubo Zhang, Hao Tan, Mohit Bansal
-
Knowledge Graph Based Synthetic Corpus Generation For Knowledge-enhanced Language Model Pre-training
(2020)
• Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
• 49 citations
Agarwal et al.
-
Crows-pairs: A Challenge Dataset For Measuring Social Biases In Masked Language Models
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 148 citations
Nangia et al.
-
BERT-XML: Large Scale Automated ICD Coding Using BERT Pretraining
(2020)
• Proceedings of the 3rd Clinical Natural Language Processing Workshop
• 73 citations
Zachariah Zhang, Jingshu Liu, Narges Razavian
-
TOD-BERT: Pre-trained Natural Language Understanding For Task-oriented Dialogue
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 203 citations
Wu et al.
-
Unitrans: Unifying Model Transfer And Data Transfer For Cross-lingual Named Entity Recognition With Unlabeled Data
(2020)
• Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
• 45 citations
Wu et al.
-
Machine Generation And Detection Of Arabic Manipulated And Fake News
(2020)
• Arxiv
• 41 citations
Nagoudi et al.
-
ETC: Encoding Long And Structured Inputs In Transformers
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 103 citations
Ainslie et al.
-
Fighting The COVID-19 Infodemic: Modeling The Perspective Of Journalists, Fact-checkers, Social Media Platforms, Policy Makers, And The Society
(2020)
• Findings of the Association for Computational Linguistics: EMNLP 2021
• 61 citations
Alam et al.
-
Stereoset: Measuring Stereotypical Bias In Pretrained Language Models
(2020)
• Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
• 224 citations
Moin Nadeem, Anna Bethke, Siva Reddy
-
Hover: A Dataset For Many-hop Fact Extraction And Claim Verification
(2020)
• Findings of the Association for Computational Linguistics: EMNLP 2020
• 83 citations
Jiang et al.
-
Covid-twitter-bert: A Natural Language Processing Model To Analyse COVID-19 Content On Twitter
(2020)
• Arxiv
• 134 citations
Martin Müller, Marcel Salathé, Per E Kummervold
-
Fashion Captioning: Towards Generating Accurate Descriptions With Semantic Rewards
(2020)
• Lecture Notes in Computer Science
• 61 citations
Yang et al.
-
Grid Tagging Scheme For Aspect-oriented Fine-grained Opinion Extraction
(2020)
• Findings of the Association for Computational Linguistics: EMNLP 2020
• 200 citations
Wu et al.
-
Uncertainty-aware Self-training For Text Classification With Few Labels
(2020)
• Arxiv
• 41 citations
Subhabrata Mukherjee, Ahmed Hassan Awadallah
-
Logic-guided Data Augmentation And Regularization For Consistent Question Answering
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 59 citations
Akari Asai, Hannaneh Hajishirzi
-
Inltk: Natural Language Toolkit For Indic Languages
(2020)
• Proceedings of Second Workshop for NLP Open Source Software (NLP-OSS)
• 42 citations
Gaurav Arora
-
Beyond 512 Tokens: Siamese Multi-depth Transformer-based Hierarchical Encoder For Long-form Document Matching
(2020)
• CIKM '20: The 29th ACM International Conference on Information and Knowledge Management
• 53 citations
Yang et al.
-
Imagebert: Cross-modal Pre-training With Large-scale Weak-supervised Image-text Data
(2020)
• Arxiv
• 154 citations
Qi et al.
-
Textattack: A Framework For Adversarial Attacks, Data Augmentation, And Adversarial Training In NLP
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
• 245 citations
Morris et al.
-
Stereotypical Bias Removal For Hate Speech Detection Task Using Knowledge-based Generalizations
(2020)
• The World Wide Web Conference
• 89 citations
Pinkesh Badjatiya, Manish Gupta, Vasudeva Varma
-
Prophetnet: Predicting Future N-gram For Sequence-to-sequence Pre-training
(2020)
• Arxiv
• 83 citations
Qi et al.
-
Referring Expression Comprehension: A Survey Of Methods And Datasets
(2020)
• IEEE Transactions on Multimedia
• 81 citations
Yanyuan Qiao, Chaorui Deng, Qi Wu
-
Towards Conversational Recommendation Over Multi-type Dialogs
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 151 citations
Liu et al.
-
Graphdialog: Integrating Graph Knowledge Into End-to-end Task-oriented Dialogue Systems
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 46 citations
Shiquan Yang, Rui Zhang, Sarah Erfani
-
Overview Of Checkthat! 2020: Automatic Identification And Verification Of Claims In Social Media
(2020)
• Lecture Notes in Computer Science
• 80 citations
Barron-Cedeno et al.
-
Investigating Pretrained Language Models For Graph-to-text Generation
(2020)
• Proceedings of the 3rd Workshop on Natural Language Processing for Conversational AI
• 43 citations
Ribeiro et al.
-
Toxic Language Detection In Social Media For Brazilian Portuguese: New Dataset And Multilingual Analysis
(2020)
• Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing
• 45 citations
Leite et al.
-
Latent Opinions Transfer Network For Target-oriented Opinion Words Extraction
(2020)
• Proceedings of the AAAI Conference on Artificial Intelligence
• 90 citations
Wu et al.
-
TED: A Pretrained Unsupervised Summarization Model With Theme Modeling And Denoising
(2020)
• Findings of the Association for Computational Linguistics: EMNLP 2020
• 43 citations
Yang et al.
-
Semeval-2020 Task 4: Commonsense Validation And Explanation
(2020)
• Proceedings of the Fourteenth Workshop on Semantic Evaluation
• 89 citations
Wang et al.
-
Deep Entity Matching With Pre-trained Language Models
(2020)
• Proceedings of the VLDB Endowment
• 246 citations
Li et al.
-
Adversarial Filters Of Dataset Biases
(2020)
• Arxiv
• 125 citations
Bras et al.
-
A Large-scale Chinese Short-text Conversation Dataset
(2020)
• Lecture Notes in Computer Science
• 99 citations
Wang et al.
-
Syntactic Data Augmentation Increases Robustness To Inference Heuristics
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 143 citations
Min et al.
-
A Survey On Machine Reading Comprehension: Tasks, Evaluation Metrics And Benchmark Datasets
(2020)
• Applied Sciences
• 61 citations
Zeng et al.
-
Pre-training Graph Transformer With Multimodal Side Information For Recommendation
(2020)
• MM '21: ACM Multimedia Conference
• 63 citations
Liu et al.
-
MART: Memory-augmented Recurrent Transformer For Coherent Video Paragraph Captioning
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 121 citations
Lei et al.
-
DIET: Lightweight Language Understanding For Dialogue Systems
(2020)
• Arxiv
• 112 citations
Bunk et al.
-
What Is More Likely To Happen Next? Video-and-language Future Event Prediction
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 41 citations
Lei et al.
-
Data Manipulation: Towards Effective Instance Learning For Neural Dialogue Generation Via Learning To Augment And Reweight
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 56 citations
Cai et al.
-
Cat-gen: Improving Robustness In NLP Models Via Controlled Adversarial Text Generation
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 62 citations
Wang et al.
-
Factual Error Correction For Abstractive Summarization Models
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 118 citations
Cao et al.
-
Expertise Style Transfer: A New Task Towards Better Communication Between Experts And Laymen
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 50 citations
Cao et al.
-
With Little Power Comes Great Responsibility
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 73 citations
Card et al.
-
Multiwoz 2.2 : A Dialogue Dataset With Additional Annotation Corrections And State Tracking Baselines
(2020)
• Proceedings of the 2nd Workshop on Natural Language Processing for Conversational AI
• 165 citations
Zang et al.
-
Hatebert: Retraining BERT For Abusive Language Detection In English
(2020)
• Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021)
• 60 citations
Caselli et al.
-
Mitigating Gender Bias For Neural Dialogue Generation With Adversarial Learning
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 52 citations
Liu et al.
-
Exclusive Hierarchical Decoding For Deep Keyphrase Generation
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 62 citations
Chen et al.
-
Artificial Intelligence (AI) In Action: Addressing The COVID-19 Pandemic With Natural Language Processing (NLP)
(2020)
• Annual Review of Biomedical Data Science
• 56 citations
Chen et al.
-
Adaptive Offline Quintuplet Loss For Image-text Matching
(2020)
• Lecture Notes in Computer Science
• 67 citations
Tianlang Chen, Jiajun Deng, Jiebo Luo
-
Cops-ref: A New Dataset And Task On Compositional Referring Expression Comprehension
(2020)
• 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• 51 citations
Chen et al.
-
IMRAM: Iterative Matching With Recurrent Attention Memory For Cross-modal Image-text Retrieval
(2020)
• 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• 391 citations
Chen et al.
-
Hybridqa: A Dataset Of Multi-hop Question Answering Over Tabular And Textual Data
(2020)
• Findings of the Association for Computational Linguistics: EMNLP 2020
• 178 citations
Chen et al.
-
Logic2text: High-fidelity Natural Language Generation From Logical Forms
(2020)
• Findings of the Association for Computational Linguistics: EMNLP 2020
• 49 citations
Chen et al.
-
Learning Modality Interaction For Temporal Sentence Localization And Event Captioning In Videos
(2020)
• Lecture Notes in Computer Science
• 89 citations
Chen et al.
-
Local Additivity Based Data Augmentation For Semi-supervised NER
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 45 citations
Chen et al.
-
Few-shot Natural Language Generation For Task-oriented Dialog
(2020)
• Findings of the Association for Computational Linguistics: EMNLP 2020
• 157 citations
Peng et al.
-
AUTSL: A Large Scale Multi-modal Turkish Sign Language Dataset And Baseline Methods
(2020)
• IEEE Access
• 200 citations
Ozge Mercanoglu Sincan, Hacer Yalim Keles
-
Graph-structured Referring Expression Reasoning In The Wild
(2020)
• 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• 96 citations
Sibei Yang, Guanbin Li, Yizhou Yu
-
What Can We Learn From Collective Human Opinions On Natural Language Inference Data?
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 77 citations
Yixin Nie, Xiang Zhou, Mohit Bansal
-
Dialoglue: A Natural Language Understanding Benchmark For Task-oriented Dialogue
(2020)
• Arxiv
• 97 citations
Shikib Mehri, Mihail Eric, Dilek Hakkani-Tur
-
HERO: Hierarchical Encoder For Video+language Omni-representation Pre-training
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 372 citations
Li et al.
-
A Benchmark For Systematic Generalization In Grounded Language Understanding
(2020)
• Arxiv
• 45 citations
Ruis et al.
-
VIOLIN: A Large-scale Dataset For Video-and-language Inference
(2020)
• 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• 56 citations
Liu et al.
-
Room-across-room: Multilingual Vision-and-language Navigation With Dense Spatiotemporal Grounding
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 190 citations
Ku et al.
-
Pre-training For Abstractive Document Summarization By Reinstating Source Text
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 42 citations
Zou et al.
-
A Vietnamese Dataset For Evaluating Machine Reading Comprehension
(2020)
• Proceedings of the 28th International Conference on Computational Linguistics
• 58 citations
Nguyen et al.
-
Zero-resource Knowledge-grounded Dialogue Generation
(2020)
• Arxiv
• 50 citations
Li et al.
-
Neural Deepfake Detection With Factual Structure Of Text
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 41 citations
Zhong et al.
-
Grappa: Grammar-augmented Pre-training For Table Semantic Parsing
(2020)
• Arxiv
• 59 citations
Yu et al.
-
Pre-training Multilingual Neural Machine Translation By Leveraging Alignment Information
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 82 citations
Lin et al.
-
Interbert: Vision-and-language Interaction For Multi-modal Pretraining
(2020)
• Arxiv
• 56 citations
Lin et al.
-
Mapping Natural Language Instructions To Mobile UI Action Sequences
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 71 citations
Li et al.
-
MM-COVID: A Multilingual And Multimodal Data Repository For Combating COVID-19 Disinformation
(2020)
• Arxiv
• 51 citations
Li et al.
-
Directions In Abusive Language Training Data: Garbage In, Garbage Out
(2020)
• PLOS ONE
• 132 citations
Bertie Vidgen, Leon Derczynski
-
Jointly Cross- And Self-modal Graph Attention Network For Query-based Moment Localization
(2020)
• Proceedings of the 28th ACM International Conference on Multimedia
• 108 citations
Liu et al.
-
X-stance: A Multilingual Multi-target Dataset For Stance Detection
(2020)
• Arxiv
• 45 citations
Jannis Vamvas, Rico Sennrich
-
Span-convert: Few-shot Span Extraction For Dialog With Pretrained Conversational Representations
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 43 citations
Coope et al.
-
Knowledge Graph-augmented Abstractive Summarization With Semantic-driven Cloze Reward
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 148 citations
Luyang Huang, Lingfei Wu, Lu Wang
-
Weakly-supervised Aspect-based Sentiment Analysis Via Joint Aspect-sentiment Topic Embedding
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 53 citations
Huang et al.
-
Frameaxis: Characterizing Microframe Bias And Intensity With Word Embedding
(2020)
• PeerJ Computer Science
• 43 citations
Kwak et al.
-
Enhancing Extractive Text Summarization With Topic-aware Graph Neural Networks
(2020)
• Proceedings of the 28th International Conference on Computational Linguistics
• 63 citations
Peng Cui, Le Hu, Yuanchao Liu
-
Mutual: A Dataset For Multi-turn Dialogue Reasoning
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 110 citations
Cui et al.
-
Data Augmentation Using Pre-trained Transformer Models
(2020)
• Proceedings of the 2nd Workshop on Life-long Learning for Spoken Language Systems
• 61 citations
Varun Kumar, Ashutosh Choudhary, Eunah Cho
-
Coco: Controllable Counterfactuals For Evaluating Dialogue State Trackers
(2020)
• Arxiv
• 41 citations
Li et al.
-
Med-bert: Pre-trained Contextualized Embeddings On Large-scale Structured Electronic Health Records For Disease Prediction
(2020)
• Arxiv
• 61 citations
Rasmy et al.
-
Cost-effective Selection Of Pretraining Data: A Case Study Of Pretraining BERT On Social Media
(2020)
• Findings of the Association for Computational Linguistics: EMNLP 2020
• 46 citations
Dai et al.
-
Few-shot Named Entity Recognition: A Comprehensive Study
(2020)
• Arxiv
• 51 citations
Huang et al.
-
Understanding Neural Abstractive Summarization Models Via Uncertainty
(2020)
• Proceedings of the 28th International Conference on Computational Linguistics
• 45 citations
Jiacheng Xu, Shrey Desai, Greg Durrett
-
Plotmachines: Outline-conditioned Generation With Dynamic Plot State Tracking
(2020)
• Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
• 97 citations
Rashkin et al.
-
Iterative Edit-based Unsupervised Sentence Simplification
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 57 citations
Kumar et al.
-
Dual-mode ASR: Unify And Improve Streaming ASR With Full-context Modeling
(2020)
• IEEE Transactions on Multimedia
• 50 citations
Yu et al.
-
AGIF: An Adaptive Graph-interactive Framework For Joint Multiple Intent Detection And Slot Filling
(2020)
• Findings of the Association for Computational Linguistics: EMNLP 2020
• 108 citations
Qin et al.
-
Vlanet: Video-language Alignment Network For Weakly-supervised Video Moment Retrieval
(2020)
• Lecture Notes in Computer Science
• 78 citations
Ma et al.
-
DMD: A Large-scale Multi-modal Driver Monitoring Dataset For Attention And Alertness Analysis
(2020)
• Lecture Notes in Computer Science
• 90 citations
Ortega et al.
-
Chart-to-text: Generating Natural Language Descriptions For Charts By Adapting The Transformer Model
(2020)
• Proceedings of the 13th International Conference on Natural Language Generation
• 46 citations
Jason Obeid, Enamul Hoque
-
Reclor: A Reading Comprehension Dataset Requiring Logical Reasoning
(2020)
• Arxiv
• 127 citations
Yu et al.
-
Towards Robustifying NLI Models Against Lexical Dataset Biases
(2020)
• Proceedings of the 28th International Conference on Computational Linguistics
• 143 citations
Xiang Zhou, Mohit Bansal
-
Fquad: French Question Answering Dataset
(2020)
• Findings of the Association for Computational Linguistics: EMNLP 2020
• 62 citations
D'Hoffschmidt et al.
-
Extractive Summarization As Text Matching
(2020)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 394 citations
Zhong et al.
-
Modality-agnostic Attention Fusion For Visual Search With Text Feedback
(2020)
• Arxiv
• 47 citations
Dodds et al.
-
Robbert: A Dutch Roberta-based Language Model
(2020)
• Findings of the Association for Computational Linguistics: EMNLP 2020
• 103 citations
Pieter Delobelle, Thomas Winters, Bettina Berendt
-
Structure-grounded Pretraining For Text-to-sql
(2020)
• Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
• 56 citations
Deng et al.
-
Chinese Street View Text: Large-scale Chinese Text Reading With Partially Supervised Learning
(2019)
• 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
• 55 citations
Sun et al.
-
BIGPATENT: A Large-scale Dataset For Abstractive And Coherent Summarization
(2019)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
• 155 citations
Eva Sharma, Chen Li, Lu Wang
-
Evaluating The State-of-the-art Of End-to-end Natural Language Generation: The E2E NLG Challenge
(2019)
• Computer Speech & Language
• 180 citations
Ondřej Dušek, Jekaterina Novikova, Verena Rieser
-
A Novel Bi-directional Interrelated Model For Joint Intent Detection And Slot Filling
(2019)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
• 188 citations
E et al.
-
Clevr-dialog: A Diagnostic Dataset For Multi-round Reasoning In Visual Dialog
(2019)
• Arxiv
• 49 citations
Kottur et al.
-
End-to-end Text-to-speech For Low-resource Languages By Cross-lingual Transfer Learning
(2019)
• Interspeech 2019
• 71 citations
Tu et al.
-
Neural Metric Learning For Fast End-to-end Relation Extraction
(2019)
• Arxiv
• 40 citations
Tung Tran, Ramakanth Kavuluru
-
Recommendation As A Communication Game: Self-supervised Bot-play For Goal-oriented Dialogue
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 80 citations
Kang et al.
-
Automated Essay Scoring Based On Two-stage Learning
(2019)
• Arxiv
• 45 citations
Jiawei Liu, Yang Xu, Yaguang Zhu
-
Self-attention Aligner: A Latency-control End-to-end Model For ASR Using Self-attention Network And Chunk-hopping
(2019)
• ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
• 100 citations
Linhao Dong, Feng Wang, Bo Xu
-
Mirrorgan: Learning Text-to-image Generation By Redescription
(2019)
• 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• 589 citations
Qiao et al.
-
Polysemous Visual-semantic Embedding For Cross-modal Retrieval
(2019)
• 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• 243 citations
Yale Song, Mohammad Soleymani
-
Cosql: A Conversational Text-to-sql Challenge Towards Cross-domain Natural Language Interfaces To Databases
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 122 citations
Yu et al.
-
TWEETQA: A Social Media Focused Question Answering Dataset
(2019)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
• 64 citations
Xiong et al.
-
DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs
(2019)
• Arxiv
• 96 citations
Dua et al.
-
An Entity-driven Framework For Abstractive Summarization
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 59 citations
Sharma et al.
-
Activitynet-qa: A Dataset For Understanding Complex Web Videos Via Question Answering
(2019)
• Proceedings of the AAAI Conference on Artificial Intelligence
• 199 citations
Yu et al.
-
Connecting The Dots: Document-level Neural Relation Extraction With Edge-oriented Graphs
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 229 citations
Fenia Christopoulou, Makoto Miwa, Sophia Ananiadou
-
The Flores Evaluation Datasets For Low-resource Machine Translation: Nepali-english And Sinhala-english
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 92 citations
Guzmán et al.
-
Entity, Relation, And Event Extraction With Contextualized Span Representations
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 275 citations
Wadden et al.
-
Adapting Text Embeddings For Causal Inference
(2019)
• Arxiv
• 51 citations
Victor Veitch, Dhanya Sridhar, David M. Blei
-
Multilingual Is Not Enough: BERT For Finnish
(2019)
• Arxiv
• 121 citations
Virtanen et al.
-
Improving Short Text Classification Through Global Augmentation Methods
(2019)
• Lecture Notes in Computer Science
• 61 citations
Vukosi Marivate, Tshephisho Sefara
-
OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge
(2019)
• 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• 545 citations
Marino et al.
-
Earlier Attention? Aspect-aware LSTM For Aspect-based Sentiment Analysis
(2019)
• Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
• 50 citations
Xing et al.
-
Camembert: A Tasty French Language Model
(2019)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 336 citations
Martin et al.
-
Automatic Radiology Report Generation Based On Multi-view Image Fusion And Medical Concept Enrichment
(2019)
• Lecture Notes in Computer Science
• 165 citations
Yuan et al.
-
Mixture Content Selection For Diverse Sequence Generation
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 56 citations
Jaemin Cho, Minjoon Seo, Hannaneh Hajishirzi
-
Improving The Similarity Measure Of Determinantal Point Processes For Extractive Multi-document Summarization
(2019)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
• 61 citations
Cho et al.
-
Learning Dual Retrieval Module For Semi-supervised Relation Extraction
(2019)
• The World Wide Web Conference
• 61 citations
Lin et al.
-
CONAN -- Counter Narratives Through Nichesourcing: A Multilingual Dataset Of Responses To Fight Online Hate Speech
(2019)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
• 78 citations
Chung et al.
-
Distilling Task-specific Knowledge From BERT Into Simple Neural Networks
(2019)
• Arxiv
• 337 citations
Tang et al.
-
Meta-learning With Dynamic-memory-based Prototypical Network For Few-shot Event Detection
(2019)
• Proceedings of the 13th International Conference on Web Search and Data Mining
• 69 citations
Deng et al.
-
A Simple But Effective Method To Incorporate Multi-turn Context With BERT For Conversational Machine Comprehension
(2019)
• Proceedings of the First Workshop on NLP for Conversational AI
• 42 citations
Ohsugi et al.
-
Coupling Retrieval And Meta-learning For Context-dependent Semantic Parsing
(2019)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
• 44 citations
Guo et al.
-
End-to-end Bias Mitigation By Modelling Biases In Corpora
(2019)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 134 citations
Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
-
Expressing Visual Relationships Via Language
(2019)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
• 42 citations
Tan et al.
-
ELI5: Long Form Question Answering
(2019)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
• 284 citations
Fan et al.
-
Cosmos QA: Machine Reading Comprehension With Contextual Commonsense Reasoning
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 271 citations
Huang et al.
-
Transforming Delete, Retrieve, Generate Approach For Controlled Text Style Transfer
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 43 citations
Akhilesh Sudhakar, Bhargav Upadhyay, Arjun Maheswaran
-
Benchmarking Zero-shot Text Classification: Datasets, Evaluation And Entailment Approach
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 219 citations
Wenpeng Yin, Jamaal Hay, Dan Roth
-
A Systematic Comparison Of Methods For Low-resource Dependency Parsing On Genuinely Low-resource Languages
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 51 citations
Vania et al.
-
Amazonqa: A Review-based Question Answering Task
(2019)
• Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
• 45 citations
Gupta et al.
-
Text Readability Assessment For Second Language Learners
(2019)
• Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications
• 138 citations
Menglin Xia, Ekaterina Kochmar, Ted Briscoe
-
Pretrained Language Models For Sequential Sentence Classification
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 51 citations
Cohan et al.
-
Do Sentence Interactions Matter? Leveraging Sentence Level Representations For Fake News Classification
(2019)
• Proceedings of the Thirteenth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-13)
• 80 citations
Vaibhav Vaibhav, Raghuram Mandyam Annasamy, Eduard Hovy
-
Speech Model Pre-training For End-to-end Spoken Language Understanding
(2019)
• Interspeech 2019
• 41 citations
Lugosch et al.
-
Improving Multi-turn Dialogue Modelling With Utterance Rewriter
(2019)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
• 106 citations
Su et al.
-
Context-aware Visual Policy Network For Fine-grained Image Captioning
(2019)
• IEEE Transactions on Pattern Analysis and Machine Intelligence
• 151 citations
Zha et al.
-
Olmpics -- On What Language Model Pre-training Captures
(2019)
• Transactions of the Association for Computational Linguistics
• 55 citations
Talmor et al.
-
Hellaswag: Can A Machine Really Finish Your Sentence?
(2019)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
• 572 citations
Zellers et al.
-
Spatio-temporal Dynamics And Semantic Attribute Enriched Visual Encoding For Video Captioning
(2019)
• 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• 234 citations
Aafaq et al.
-
A Question-entailment Approach To Question Answering
(2019)
• BMC Bioinformatics
• 174 citations
Asma Ben Abacha, Dina Demner-Fushman
-
75 Languages, 1 Model: Parsing Universal Dependencies Universally
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 61 citations
Dan Kondratyuk, Milan Straka
-
Unlearn Dataset Bias In Natural Language Inference By Fitting The Residual
(2019)
• Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019)
• 164 citations
He He, Sheng Zha, Haohan Wang
-
Visual Entailment: A Novel Task For Fine-grained Image Understanding
(2019)
• Arxiv
• 162 citations
Xie et al.
-
Generating Token-level Explanations For Natural Language Inference
(2019)
• Proceedings of the 2019 Conference of the North
• 41 citations
Thorne et al.
-
Learning To Generalize From Sparse And Underspecified Rewards
(2019)
• Proceedings of the 36th International Conference on Machine Learning, PMLR 97:130-140, 2019
• 46 citations
Agarwal et al.
-
Juice: A Large Scale Distantly Supervised Dataset For Open Domain Context-based Code Generation
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 43 citations
Rajas Agashe, Srinivasan Iyer, Luke Zettlemoyer
-
A Unified MRC Framework For Named Entity Recognition
(2019)
• Arxiv
• 50 citations
Li et al.
-
Jasper: An End-to-end Convolutional Neural Acoustic Model
(2019)
• Interspeech 2019
• 212 citations
Li et al.
-
Learning The Difference That Makes A Difference With Counterfactually-augmented Data
(2019)
• Arxiv
• 231 citations
Divyansh Kaushik, Eduard Hovy, Zachary C. Lipton
-
PEGASUS: Pre-training With Extracted Gap-sentences For Abstractive Summarization
(2019)
• Arxiv
• 976 citations
Zhang et al.
-
Reconstruct And Represent Video Contents For Captioning Via Reinforcement Learning
(2019)
• IEEE Transactions on Pattern Analysis and Machine Intelligence
• 79 citations
Zhang et al.
-
Probing What Different NLP Tasks Teach Machines About Function Word Comprehension
(2019)
• Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*SEM 2019)
• 94 citations
Kim et al.
-
The Effect Of Translationese In Machine Translation Test Sets
(2019)
• Proceedings of the Fourth Conference on Machine Translation (Volume 1: Research Papers)
• 69 citations
Mike Zhang, Antonio Toral
-
Induction Networks For Few-shot Text Classification
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 197 citations
Geng et al.
-
HIBERT: Document Level Pre-training Of Hierarchical Bidirectional Transformers For Document Summarization
(2019)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
• 161 citations
Xingxing Zhang, Furu Wei, Ming Zhou
-
Mathqa: Towards Interpretable Math Word Problem Solving With Operation-based Formalisms
(2019)
• Arxiv
• 119 citations
Amini et al.
-
Controllable Dual Skew Divergence Loss For Neural Machine Translation
(2019)
• Arxiv
• 79 citations
Li et al.
-
Vision-and-dialog Navigation
(2019)
• Arxiv
• 118 citations
Thomason et al.
-
UER: An Open-source Toolkit For Pre-training Models
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations
• 88 citations
Zhao et al.
-
NAS Evaluation Is Frustratingly Hard
(2019)
• Arxiv
• 109 citations
Antoine Yang, Pedro M. Esperança, Fabio M. Carlucci
-
Simple And Effective Text Matching With Richer Alignment Features
(2019)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
• 171 citations
Yang et al.
-
Grounding Human-to-vehicle Advice For Self-driving Vehicles
(2019)
• 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• 101 citations
Kim et al.
-
Roberta: A Robustly Optimized BERT Pretraining Approach
(2019)
• Arxiv
• 16976 citations
Liu et al.
-
Automatic Argument Quality Assessment -- New Datasets And Methods
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 55 citations
Toledo et al.
-
Text Summarization With Pretrained Encoders
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 1525 citations
Yang Liu, Mirella Lapata
-
Summary Level Training Of Sentence Rewriting For Abstractive Summarization
(2019)
• Proceedings of the 2nd Workshop on New Frontiers in Summarization
• 61 citations
Bae et al.
-
Generalized Data Augmentation For Low-resource Translation
(2019)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
• 109 citations
Xia et al.
-
Improving Referring Expression Grounding With Cross-modal Attention-guided Erasing
(2019)
• 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• 189 citations
Liu et al.
-
A Stack-propagation Framework With Token-level Intent Detection For Spoken Language Understanding
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 302 citations
Qin et al.
-
Bert-based Ranking For Biomedical Entity Normalization
(2019)
• Arxiv
• 93 citations
Zongcheng Ji, Qiang Wei, Hua Xu
-
Saliency-guided Attention Network For Image-sentence Matching
(2019)
• 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
• 105 citations
Ji et al.
-
TANDA: Transfer And Adapt Pre-trained Transformer Models For Answer Sentence Selection
(2019)
• Proceedings of the AAAI Conference on Artificial Intelligence
• 92 citations
Siddhant Garg, Thuy Vu, Alessandro Moschitti
-
MLQA: Evaluating Cross-lingual Extractive Question Answering
(2019)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 52 citations
Lewis et al.
-
Constrained Decoding For Neural NLG From Compositional Representations In Task-oriented Dialogue
(2019)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
• 74 citations
Balakrishnan et al.
-
Unsupervised Question Answering By Cloze Translation
(2019)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
• 136 citations
Patrick Lewis, Ludovic Denoyer, Sebastian Riedel
-
Scibert: A Pretrained Language Model For Scientific Text
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 1631 citations
Iz Beltagy, Kyle Lo, Arman Cohan
-
NCLS: Neural Cross-lingual Summarization
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 121 citations
Zhu et al.
-
Multi-task Deep Neural Networks For Natural Language Understanding
(2019)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
• 1026 citations
Liu et al.
-
Humor Detection: A Transformer Gets The Last Laugh
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 44 citations
Orion Weller, Kevin Seppi
-
PAWS-X: A Cross-lingual Adversarial Dataset For Paraphrase Identification
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 222 citations
Yang et al.
-
Enhancing Amr-to-text Generation With Dual Graph Representations
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 57 citations
Leonardo F. R. Ribeiro, Claire Gardent, Iryna Gurevych
-
Sampling Bias In Deep Active Classification: An Empirical Study
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 45 citations
Ameya Prabhu, Charles Dognin, Maneesh Singh
-
GEAR: Graph-based Evidence Aggregating And Reasoning For Fact Verification
(2019)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
• 196 citations
Zhou et al.
-
Knowledge Aware Conversation Generation With Explainable Reasoning Over Augmented Graphs
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 95 citations
Liu et al.
-
Topic-enhanced Memory Networks For Personalised Point-of-interest Recommendation
(2019)
• Arxiv
• 54 citations
Xiao Zhou, Cecilia Mascolo, Zhongxiang Zhao
-
Multi-task Learning With Language Modeling For Question Generation
(2019)
• Arxiv
• 58 citations
Wenjie Zhou, Minghua Zhang, Yunfang Wu
-
Cm-net: A Novel Collaborative Memory Network For Spoken Language Understanding
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 89 citations
Liu et al.
-
Counterfactual Story Reasoning And Generation
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 98 citations
Qin et al.
-
A Novel Aspect-guided Deep Transition Model For Aspect Based Sentiment Analysis
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 49 citations
Liang et al.
-
Taskmaster-1: Toward A Realistic And Diverse Dialog Dataset
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 134 citations
Byrne et al.
-
Assessing The Factual Accuracy Of Generated Text
(2019)
• Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
• 145 citations
Goodrich et al.
-
Winogrande: An Adversarial Winograd Schema Challenge At Scale
(2019)
• Arxiv
• 83 citations
Sakaguchi et al.
-
Transfer Learning In Biomedical Natural Language Processing: An Evaluation Of BERT And Elmo On Ten Benchmarking Datasets
(2019)
• Proceedings of the 18th BioNLP Workshop and Shared Task
• 792 citations
Yifan Peng, Shankai Yan, Zhiyong Lu
-
Addressing Semantic Drift In Question Generation For Semi-supervised Question Answering
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 109 citations
Shiyue Zhang, Mohit Bansal
-
Howto100m: Learning A Text-video Embedding By Watching Hundred Million Narrated Video Clips
(2019)
• 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
• 117 citations
Miech et al.
-
Measuring Compositional Generalization: A Comprehensive Method On Realistic Data
(2019)
• Arxiv
• 55 citations
Keysers et al.
-
Personalized Dialogue Generation With Diversified Traits
(2019)
• Arxiv
• 89 citations
Zheng et al.
-
Clevr-ref+: Diagnosing Visual Reasoning With Referring Expressions
(2019)
• 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• 101 citations
Liu et al.
-
From Senones To Chenones: Tied Context-dependent Graphemes For Hybrid Speech Recognition
(2019)
• 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
• 72 citations
Le et al.
-
Sentence Centrality Revisited For Unsupervised Summarization
(2019)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
• 175 citations
Hao Zheng, Mirella Lapata
-
Proactive Human-machine Conversation With Explicit Conversation Goals
(2019)
• Arxiv
• 41 citations
Wu et al.
-
Pubmedqa: A Dataset For Biomedical Research Question Answering
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 425 citations
Jin et al.
-
Adversarial NLI: A New Benchmark For Natural Language Understanding
(2019)
• Arxiv
• 66 citations
Nie et al.
-
Crossweigh: Training Named Entity Tagger From Imperfect Annotations
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 96 citations
Wang et al.
-
Mining Discourse Markers For Unsupervised Sentence Representation Learning
(2019)
• Proceedings of the 2019 Conference of the North
• 41 citations
Sileo et al.
-
Fine-tune Bert For Docred With Two-step Process
(2019)
• Arxiv
• 116 citations
Wang et al.
-
Integrating Multimodal Information In Large Pretrained Transformers
(2019)
• Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
• 421 citations
Rahman et al.
-
Evidence Sentence Extraction For Machine Reading Comprehension
(2019)
• Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)
• 45 citations
Wang et al.
-
Modeling Sentiment Dependencies With Graph Convolutional Networks For Aspect-level Sentiment Classification
(2019)
• Knowledge-Based Systems
• 180 citations
Pinlong Zhao, Linlin Hou, Ou Wu
-
Automatic Spanish Translation Of The Squad Dataset For Multilingual Question Answering
(2019)
• Arxiv
• 42 citations
Casimiro Pio Carrino, Marta R. Costa-Jussà, José A. R. Fonollosa
-
VATEX: A Large-scale, High-quality Multilingual Dataset For Video-and-language Research
(2019)
• 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
• 326 citations
Wang et al.
-
Weakly-supervised Spatio-temporally Grounding Natural Sentence In Video
(2019)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
• 101 citations
Chen et al.
-
Superglue: A Stickier Benchmark For General-purpose Language Understanding Systems
(2019)
• Arxiv
• 984 citations
Wang et al.
-
BERT Post-training For Review Reading Comprehension And Aspect-based Sentiment Analysis
(2019)
• Arxiv
• 358 citations
Xu et al.
-
A Constructive Prediction Of The Generalization Error Across Scales
(2019)
• Arxiv
• 49 citations
Rosenfeld et al.
-
Matching Images And Text With Multi-modal Tensor Fusion And Re-ranking
(2019)
• Proceedings of the 27th ACM International Conference on Multimedia
• 145 citations
Wang et al.
-
Trouble On The Horizon: Forecasting The Derailment Of Online Conversations As They Develop
(2019)
• Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
• 45 citations
Jonathan P. Chang, Cristian Danescu-Niculescu-Mizil
-
A Comprehensive Exploration On Wikisql With Table-aware Word Contextualization
(2019)
• Arxiv
• 122 citations
Hwang et al.
-
UNITER: Universal Image-text Representation Learning
(2019)
• Arxiv
• 183 citations
Chen et al.
-
A Hierarchical Reinforced Sequence Operation Method For Unsupervised Text Style Transfer
(2019)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
• 53 citations
Wu et al.
-
Understanding Dataset Design Choices For Multi-hop Reasoning
(2019)
• Proceedings of the 2019 Conference of the North
• 90 citations
Jifan Chen, Greg Durrett
-
Multi-hop Question Answering Via Reasoning Chains
(2019)
• Arxiv
• 66 citations
Jifan Chen, Shih-Ting Lin, Greg Durrett
-
Blackmarks: Blackbox Multibit Watermarking For Deep Neural Networks
(2019)
• Arxiv
• 41 citations
Huili Chen, Bita Darvish Rouhani, Farinaz Koushanfar
-
Adaptive Embedding Gate For Attention-based Scene Text Recognition
(2019)
• Neurocomputing
• 41 citations
Chen et al.
-
Complementary Fusion Of Multi-features And Multi-modalities In Sentiment Analysis
(2019)
• Arxiv
• 53 citations
Chen et al.
-
Deep Short Text Classification With Knowledge Powered Attention
(2019)
• Proceedings of the AAAI Conference on Artificial Intelligence
• 132 citations
Chen et al.
-
Temporal Deformable Convolutional Encoder-decoder Networks For Video Captioning
(2019)
• Proceedings of the AAAI Conference on Artificial Intelligence
• 99 citations
Chen et al.
-
Review-driven Answer Generation For Product-related Questions In E-commerce
(2019)
• Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining
• 50 citations
Chen et al.
-
Tabfact: A Large-scale Dataset For Table-based Fact Verification
(2019)
• Arxiv
• 179 citations
Chen et al.
-
Docred: A Large-scale Document-level Relation Extraction Dataset
(2019)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
• 479 citations
Yao et al.
-
Higru: Hierarchical Gated Recurrent Units For Utterance-level Emotion Recognition
(2019)
• Arxiv
• 70 citations
Jiao et al.
-
HELP: A Dataset For Identifying Shortcomings Of Neural Models In Monotonicity Reasoning
(2019)
• Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*SEM 2019)
• 50 citations
Yanaka et al.
-
Explain Yourself! Leveraging Language Models For Commonsense Reasoning
(2019)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
• 86 citations
Rajani et al.
-
Conversing By Reading: Contentful Neural Conversation With On-demand Machine Reading
(2019)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
• 97 citations
Qin et al.
-
GQA: A New Dataset For Real-world Visual Reasoning And Compositional Question Answering
(2019)
• 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• 132 citations
Drew A. Hudson, Christopher D. Manning
-
Convlab: Multi-domain End-to-end Dialog System Platform
(2019)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations
• 87 citations
Lee et al.
-
Hotpotqa: A Dataset For Diverse, Explainable Multi-hop Question Answering
(2018)
• Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
• 540 citations
Yang et al.
-
What Makes Reading Comprehension Questions Easier?
(2018)
• Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
• 102 citations
Sugawara et al.
-
Building A Conversational Agent Overnight With Dialogue Self-play
(2018)
• Arxiv
• 161 citations
Shah et al.
-
Staqc: A Systematically Mined Question-code Dataset From Stack Overflow
(2018)
• the 2018 World Wide Web Conference
• 44 citations
Yao et al.
-
Know What You Don't Know: Unanswerable Questions For Squad
(2018)
• Arxiv
• 209 citations
Pranav Rajpurkar, Robin Jia, Percy Liang
-
Towards Explainable NLP: A Generative Explanation Framework For Text Classification
(2018)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
• 141 citations
Hui Liu, Qingyu Yin, William Yang Wang
-
Multimodal Explanations: Justifying Decisions And Pointing To The Evidence
(2018)
• 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
• 101 citations
Park et al.
-
Learning A Text-video Embedding From Incomplete And Heterogeneous Data
(2018)
• Arxiv
• 174 citations
Antoine Miech, Ivan Laptev, Josef Sivic
-
Wronging A Right: Generating Better Errors To Improve Grammatical Error Detection
(2018)
• Arxiv
• 56 citations
Sudhanshu Kasewa, Pontus Stenetorp, Sebastian Riedel
-
Word2vec Applied To Recommendation: Hyperparameters Matter
(2018)
• Arxiv
• 44 citations
Hugo Caselles-Dupré, Florian Lesaint, Jimena Royo-Letelier
-
Neural Aesthetic Image Reviewer
(2018)
• Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
• 107 citations
Wang et al.
-
Neural Baby Talk
(2018)
• 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
• 442 citations
Lu et al.
-
Adventure: Adversarial Training For Textual Entailment With Knowledge-guided Examples
(2018)
• Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
• 64 citations
Kang et al.
-
ODSQA: Open-domain Spoken Question Answering Dataset
(2018)
• 2018 IEEE Spoken Language Technology Workshop (SLT)
• 42 citations
Lee et al.
-
Complex Sequential Question Answering: Towards Learning To Converse Over Linked Question Answer Pairs With A Knowledge Graph
(2018)
• Proceedings of the AAAI Conference on Artificial Intelligence
• 172 citations
Saha et al.
-
Emrqa: A Large Corpus For Question Answering On Electronic Medical Records
(2018)
• Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
• 164 citations
Pampari et al.
-
End-to-end Non-autoregressive Neural Machine Translation With Connectionist Temporal Classification
(2018)
• Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
• 132 citations
Jindřich Libovický, Jindřich Helcl
-
Improving Text-to-sql Evaluation Methodology
(2018)
• Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
• 213 citations
Finegan-Dollak et al.
-
Preco: A Large-scale Dataset In Preschool Vocabulary For Coreference Resolution
(2018)
• Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
• 44 citations
Chen et al.
-
Can A Suit Of Armor Conduct Electricity? A New Dataset For Open Book Question Answering
(2018)
• Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
• 557 citations
Mihaylov et al.
-
Born Again Neural Networks
(2018)
• Arxiv
• 442 citations
Furlanello et al.
-
A Reinforced Topic-aware Convolutional Sequence-to-sequence Model For Abstractive Text Summarization
(2018)
• Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
• 127 citations
Wang et al.
-
Unsupervised Discrete Sentence Representation Learning For Interpretable Neural Dialog Generation
(2018)
• Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
• 136 citations
Tiancheng Zhao, Kyusong Lee, Maxine Eskenazi
-
Gender Bias In Neural Natural Language Processing
(2018)
• Arxiv
• 73 citations
Lu et al.
-
Multi-modal Data Augmentation For End-to-end ASR
(2018)
• Interspeech 2018
• 54 citations
Renduchintala et al.
-
Hierarchical Neural Story Generation
(2018)
• Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
• 1076 citations
Angela Fan, Mike Lewis, Yann Dauphin
-
Video Description: A Survey Of Methods, Datasets And Evaluation Metrics
(2018)
• ACM Computing Surveys
• 138 citations
Aafaq et al.
-
Mem2seq: Effectively Incorporating Knowledge Bases Into End-to-end Task-oriented Dialog Systems
(2018)
• Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
• 44 citations
Andrea Madotto, Chien-Sheng Wu, Pascale Fung
-
Reasoning About Actions And State Changes By Injecting Commonsense Knowledge
(2018)
• Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
• 84 citations
Tandon et al.
-
Sarcasm Analysis Using Conversation Context
(2018)
• Computational Linguistics
• 77 citations
Debanjan Ghosh, Alexander R. Fabbri, Smaranda Muresan
-
Query And Output: Generating Words By Querying Distributed Word Representations For Paraphrase Generation
(2018)
• Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)
• 67 citations
Ma et al.
-
AISHELL-2: Transforming Mandarin ASR Research Into Industrial Scale
(2018)
• Arxiv
• 201 citations
Du et al.
-
Transforming Question Answering Datasets Into Natural Language Inference Datasets
(2018)
• Arxiv
• 122 citations
Dorottya Demszky, Kelvin Guu, Percy Liang
-
GLAC Net: Glocal Attention Cascading Networks For Multi-image Cued Story Generation
(2018)
• Arxiv
• 53 citations
Kim et al.
-
Nocaps: Novel Object Captioning At Scale
(2018)
• 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
• 233 citations
Agrawal et al.
-
A Retrospective Analysis Of The Fake News Challenge Stance Detection Task
(2018)
• Arxiv
• 68 citations
Hanselowski et al.
-
Textual Explanations For Self-driving Vehicles
(2018)
• Lecture Notes in Computer Science
• 283 citations
Kim et al.
-
A Large-scale Corpus For Conversation Disentanglement
(2018)
• Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
• 86 citations
Kummerfeld et al.
-
Event2mind: Commonsense Inference On Events, Intents, And Reactions
(2018)
• Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
• 40 citations
Rashkin et al.
-
End-to-end Neural Entity Linking
(2018)
• Proceedings of the 22nd Conference on Computational Natural Language Learning
• 233 citations
Nikolaos Kolitsas, Octavian-Eugen Ganea, Thomas Hofmann
-
XNLI: Evaluating Cross-lingual Sentence Representations
(2018)
• Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
• 906 citations
Conneau et al.
-
Autoencoder As Assistant Supervisor: Improving Text Representation For Chinese Social Media Text Summarization
(2018)
• Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
• 43 citations
Ma et al.
-
Wikihow: A Large Scale Text Summarization Dataset
(2018)
• Arxiv
• 177 citations
Mahnaz Koupaee, William Yang Wang
-
Contextual Parameter Generation For Universal Neural Machine Translation
(2018)
• Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
• 155 citations
Platanios et al.
-
Textfield: Learning A Deep Direction Field For Irregular Scene Text Detection
(2018)
• IEEE Transactions on Image Processing
• 355 citations
Xu et al.
-
Back-translation-style Data Augmentation For End-to-end ASR
(2018)
• 2018 IEEE Spoken Language Technology Workshop (SLT)
• 95 citations
Hayashi et al.
-
A Hierarchical Structured Self-attentive Model For Extractive Document Summarization (HSSAS)
(2018)
• IEEE Access
• 127 citations
Kamal Al-Sabahi, Zhang Zuping, Mohammed Nadher
-
SWAG: A Large-scale Adversarial Dataset For Grounded Commonsense Inference
(2018)
• Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
• 103 citations
Zellers et al.
-
Self-attentive Sequential Recommendation
(2018)
• 2018 IEEE International Conference on Data Mining (ICDM)
• 2379 citations
Wang-Cheng Kang, Julian McAuley
-
E-snli: Natural Language Inference With Natural Language Explanations
(2018)
• Arxiv
• 282 citations
Camburu et al.
-
Abstractive Summarization Of Reddit Posts With Multi-level Memory Networks
(2018)
• Arxiv
• 60 citations
Byeongchang Kim, Hyunwoo Kim, Gunhee Kim
-
An End-to-end Textspotter With Explicit Alignment And Attention
(2018)
• 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
• 237 citations
He et al.
-
Multiwoz -- A Large-scale Multi-domain Wizard-of-oz Dataset For Task-oriented Dialogue Modelling
(2018)
• Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
• 312 citations
Budzianowski et al.
-
Multi-pointer Co-attention Networks For Recommendation
(2018)
• Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
• 220 citations
Yi Tay, Luu Anh Tuan, Siu Cheung Hui
-
Coqa: A Conversational Question Answering Challenge
(2018)
• Transactions of the Association for Computational Linguistics
• 97 citations
Siva Reddy, Danqi Chen, Christopher D. Manning
-
Diverse Few-shot Text Classification With Multiple Metrics
(2018)
• Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)
• 230 citations
Yu et al.
-
Show, Tell And Discriminate: Image Captioning By Self-retrieval With Partially Labeled Data
(2018)
• Lecture Notes in Computer Science
• 83 citations
Liu et al.
-
Simple Unsupervised Keyphrase Extraction Using Sentence Embeddings
(2018)
• Proceedings of the 22nd Conference on Computational Natural Language Learning
• 224 citations
Bennani-Smires et al.
-
End-to-end Dense Video Captioning With Masked Transformer
(2018)
• 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• 401 citations
Zhou et al.
-
Densely Connected Bidirectional LSTM With Applications To Sentence Classification
(2018)
• Lecture Notes in Computer Science
• 64 citations
Ding et al.
-
Adversarially Regularising Neural NLI Models To Integrate Logical Background Knowledge
(2018)
• Arxiv
• 43 citations
Pasquale Minervini, Sebastian Riedel
-
Tracking State Changes In Procedural Text: A Challenge Dataset And Models For Process Paragraph Comprehension
(2018)
• Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)
• 109 citations
Mishra et al.
-
Towards Exploiting Background Knowledge For Building Conversation Systems
(2018)
• Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
• 158 citations
Moghe et al.
-
Towards Deep Conversational Recommendations
(2018)
• Arxiv
• 123 citations
Li et al.
-
A Discourse-aware Attention Model For Abstractive Summarization Of Long Documents
(2018)
• Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)
• 68 citations
Cohan et al.
-
Microsoft Dialogue Challenge: Building End-to-end Task-completion Dialogue Systems
(2018)
• Arxiv
• 60 citations
Li et al.
-
A Hierarchical End-to-end Model For Jointly Improving Text Summarization And Sentiment Classification
(2018)
• Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
• 62 citations
Ma et al.
-
A Hierarchical Latent Structure For Variational Conversation Modeling
(2018)
• Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)
• 102 citations
Yookoon Park, Jaemin Cho, Gunhee Kim
-
Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context
(2018)
• Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
• 115 citations
Khandelwal et al.
-
Meansum: A Neural Model For Unsupervised Multi-document Abstractive Summarization
(2018)
• Arxiv
• 97 citations
Eric Chu, Peter J. Liu
-
Large Scale Distributed Neural Network Training Through Online Distillation
(2018)
• Arxiv
• 152 citations
Anil et al.
-
Escape: A Large-scale Synthetic Corpus For Automatic Post-editing
(2018)
• Arxiv
• 50 citations
Negri et al.
-
Learning To Mine Aligned Code And Natural Language Pairs From Stack Overflow
(2018)
• Proceedings of the 15th International Conference on Mining Software Repositories
• 183 citations
Yin et al.
-
Multilingual Extractive Reading Comprehension By Runtime Machine Translation
(2018)
• Arxiv
• 59 citations
Asai et al.
-
Pythia V0.1: The Winning Entry To The VQA Challenge 2018
(2018)
• Arxiv
• 165 citations
Jiang et al.
-
Aspect Term Extraction With History Attention And Selective Transformation
(2018)
• Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
• 275 citations
Li et al.
-
Delete, Retrieve, Generate: A Simple Approach To Sentiment And Style Transfer
(2018)
• Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)
• 114 citations
Li et al.
-
Spider: A Large-scale Human-labeled Dataset For Complex And Cross-domain Semantic Parsing And Text-to-sql Task
(2018)
• Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
• 672 citations
Yu et al.
-
Adversarial Removal Of Demographic Attributes From Text Data
(2018)
• Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
• 243 citations
Yanai Elazar, Yoav Goldberg
-
Quac : Question Answering In Context
(2018)
• Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
• 678 citations
Choi et al.
-
Scaling Neural Machine Translation
(2018)
• Proceedings of the Third Conference on Machine Translation: Research Papers
• 80 citations
Ott et al.
-
Learning Private Neural Language Modeling With Attentive Aggregation
(2018)
• 2019 International Joint Conference on Neural Networks (IJCNN)
• 93 citations
Ji et al.
-
Baseline Needs More Love: On Simple Word-embedding-based Models And Associated Pooling Mechanisms
(2018)
• Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
• 319 citations
Shen et al.
-
Harvesting Paragraph-level Question-answer Pairs From Wikipedia
(2018)
• Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
• 169 citations
Xinya Du, Claire Cardie
-
Textsnake: A Flexible Representation For Detecting Text Of Arbitrary Shapes
(2018)
• Lecture Notes in Computer Science
• 623 citations
Long et al.
-
Style Transfer As Unsupervised Machine Translation
(2018)
• Arxiv
• 114 citations
Zhang et al.
-
Table-to-text: Describing Table Region With Natural Language
(2018)
• Proceedings of the AAAI Conference on Artificial Intelligence
• 62 citations
Bao et al.
-
FOTS: Fast Oriented Text Spotting With A Unified Network
(2018)
• 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
• 563 citations
Liu et al.
-
Neural Abstractive Text Summarization With Sequence-to-sequence Models
(2018)
• Arxiv
• 68 citations
Shi et al.
-
Reinforced Self-attention Network: A Hybrid Of Hard And Soft Attention For Sequence Modeling
(2018)
• Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
• 134 citations
Shen et al.
-
Faithful To The Original: Fact Aware Neural Abstractive Summarization
(2017)
• Arxiv
• 174 citations
Cao et al.
-
Deep Active Learning For Named Entity Recognition
(2017)
• Proceedings of the 2nd Workshop on Representation Learning for NLP
• 364 citations
Shen et al.
-
Dense-captioning Events In Videos
(2017)
• 2017 IEEE International Conference on Computer Vision (ICCV)
• 50 citations
Krishna et al.
-
FOIL It! Find One Mismatch Between Image And Language Caption
(2017)
• Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
• 48 citations
Shekhar et al.
-
Disan: Directional Self-attention Network For Rnn/cnn-free Language Understanding
(2017)
• Arxiv
• 113 citations
Shen et al.
-
A Deep Reinforced Model For Abstractive Summarization
(2017)
• Arxiv
• 1273 citations
Romain Paulus, Caiming Xiong, Richard Socher
-
A Parallel Corpus Of Python Functions And Documentation Strings For Automated Code Documentation And Code Generation
(2017)
• Arxiv
• 62 citations
Antonio Valerio Miceli Barone, Rico Sennrich
-
TALL: Temporal Activity Localization Via Language Query
(2017)
• 2017 IEEE International Conference on Computer Vision (ICCV)
• 768 citations
Gao et al.
-
Flexible End-to-end Dialogue System For Knowledge Grounded Conversation
(2017)
• Arxiv
• 88 citations
Zhu et al.
-
Improved Variational Autoencoders For Text Modeling Using Dilated Convolutions
(2017)
• Arxiv
• 94 citations
Yang et al.
-
Learning To Generate Reviews And Discovering Sentiment
(2017)
• Arxiv
• 350 citations
Alec Radford, Rafal Jozefowicz, Ilya Sutskever
-
Latent Relational Metric Learning Via Memory-based Attention For Collaborative Ranking
(2017)
• Proceedings of the 2018 World Wide Web Conference on World Wide Web - WWW '18
• 214 citations
Yi Tay, Anh Tuan Luu, Siu Cheung Hui
-
Variational Reasoning For Question Answering With Knowledge Graph
(2017)
• Arxiv
• 180 citations
Zhang et al.
-
Neural Rating Regression With Abstractive Tips Generation For Recommendation
(2017)
• Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval
• 288 citations
Li et al.
-
Quasar: Datasets For Question Answering By Search And Reading
(2017)
• Arxiv
• 139 citations
Bhuwan Dhingra, Kathryn Mazaitis, William W. Cohen
-
Personalization In Goal-oriented Dialog
(2017)
• Arxiv
• 63 citations
Chaitanya K. Joshi, Fei Mi, Boi Faltings
-
Attend And Diagnose: Clinical Time Series Analysis Using Attention Models
(2017)
• Arxiv
• 41 citations
Song et al.
-
Inter-session Modeling For Session-based Recommendation
(2017)
• Proceedings of the 2nd Workshop on Deep Learning for Recommender Systems
• 72 citations
Massimiliano Ruocco, Ole Steinar Lillestøl Skrede, Helge Langseth
-
I2T2I: Learning Text To Image Synthesis With Textual Data Augmentation
(2017)
• 2017 IEEE International Conference on Image Processing (ICIP)
• 60 citations
Dong et al.
-
Learning To Generate Long-term Future Via Hierarchical Prediction
(2017)
• Arxiv
• 180 citations
Villegas et al.
-
Deal Or No Deal? End-to-end Learning For Negotiation Dialogues
(2017)
• Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
• 51 citations
Lewis et al.
-
Seq2sql: Generating Structured Queries From Natural Language Using Reinforcement Learning
(2017)
• Arxiv
• 782 citations
Victor Zhong, Caiming Xiong, Richard Socher
-
Aspect-augmented Adversarial Networks For Domain Adaptation
(2017)
• Transactions of the Association for Computational Linguistics
• 93 citations
Yuan Zhang, Regina Barzilay, Tommi Jaakkola
-
Unconstrained Scene Text And Video Text Recognition For Arabic Script
(2017)
• 2017 1st International Workshop on Arabic Script Analysis and Recognition (ASAR)
• 49 citations
Mohit Jain, Minesh Mathew, C. V. Jawahar
-
Fusionnet: Fusing Via Fully-aware Attention With Application To Machine Comprehension
(2017)
• Arxiv
• 86 citations
Huang et al.
-
Neural Natural Language Inference Models Enhanced With External Knowledge
(2017)
• Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
• 271 citations
Chen et al.
-
Adacomp : Adaptive Residual Gradient Compression For Data-parallel Distributed Training
(2017)
• Arxiv
• 74 citations
Chen et al.
-
Key-value Retrieval Networks For Task-oriented Dialogue
(2017)
• Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue
• 51 citations
Mihail Eric, Christopher D. Manning
-
Incorporating Copying Mechanism In Image Captioning For Learning Novel Objects
(2017)
• 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
• 149 citations
Yao et al.
-
Deep Gradient Compression: Reducing The Communication Bandwidth For Distributed Training
(2017)
• ICLR 2018
• 645 citations
Lin et al.
-
Dissent: Sentence Representation Learning From Explicit Discourse Relations
(2017)
• Arxiv
• 59 citations
Allen Nie, Erin D. Bennett, Noah D. Goodman
-
Video Captioning With Guidance Of Multimodal Latent Topics
(2017)
• Proceedings of the 25th ACM international conference on Multimedia
• 66 citations
Chen et al.
-
Gradnorm: Gradient Normalization For Adaptive Loss Balancing In Deep Multitask Networks
(2017)
• Proceedings of the 35th International Conference on Machine Learning (2018) 793-802
• 443 citations
Chen et al.
-
Just ASK: Building An Architecture For Extensible Self-service Spoken Language Understanding
(2017)
• Arxiv
• 56 citations
Kumar et al.
-
Don't Just Assume; Look And Answer: Overcoming Priors For Visual Question Answering
(2017)
• 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
• 40 citations
Agrawal et al.
-
Attentive Memory Networks: Efficient Machine Reading For Conversational Search
(2017)
• Proceedings of 1st International Workshop on Conversational Approaches to Information Retrieval Tokyo Japan August 11 2017 (CAIR17)
• 40 citations
Tom Kenter, Maarten de Rijke
-
Dureader: A Chinese Machine Reading Comprehension Dataset From Real-world Applications
(2017)
• Arxiv
• 51 citations
He et al.
-
Image-grounded Conversations: Multimodal Context For Natural Question And Response Generation
(2017)
• Arxiv
• 117 citations
Mostafazadeh et al.
-
Neural Semantic Parsing By Character-based Translation: Experiments With Abstract Meaning Representations
(2017)
• Arxiv
• 82 citations
Rik van Noord, Johan Bos
-
Parlai: A Dialog Research Software Platform
(2017)
• Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
• 108 citations
Miller et al.
-
Simple And Effective Multi-paragraph Reading Comprehension
(2017)
• Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
• 63 citations
Christopher Clark, Matt Gardner
-
BPEmb: Tokenization-free Pre-trained Subword Embeddings In 275 Languages
(2017)
• Arxiv
• 127 citations
Benjamin Heinzerling, Michael Strube
-
Learning To Compose Domain-specific Transformations For Data Augmentation
(2017)
• Advances in Neural Information Processing Systems 30 2017 3236-3246
• 182 citations
Ratner et al.
-
Skeleton Key: Image Captioning By Skeleton-attribute Decomposition
(2017)
• 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
• 100 citations
Wang et al.
-
Question Answering Through Transfer Learning From Large Fine-grained Supervision Data
(2017)
• Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
• 114 citations
Sewon Min, Minjoon Seo, Hannaneh Hajishirzi
-
Accelerating Innovation Through Analogy Mining
(2017)
• Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
• 60 citations
Hope et al.
-
Generating High-quality And Informative Conversation Responses With Sequence-to-sequence Models
(2017)
• Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
• 187 citations
Shao et al.
-
Visual Question Answering: A Survey Of Methods And Datasets
(2016)
• Arxiv
• 44 citations
Wu et al.
-
Machine Comprehension Using Match-LSTM And Answer Pointer
(2016)
• Arxiv
• 414 citations
Shuohang Wang, Jing Jiang
-
Collaborative Recurrent Autoencoder: Recommend While Learning To Fill In The Blanks
(2016)
• Arxiv
• 79 citations
Hao Wang, Xingjian Shi, Dit-Yan Yeung
-
A Hierarchical Model Of Reviews For Aspect-based Sentiment Analysis
(2016)
• Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
• 49 citations
Sebastian Ruder, Parsa Ghaffari, John G. Breslin
-
Dataset And Neural Recurrent Sequence Labeling Model For Open-domain Factoid Question Answering
(2016)
• Arxiv
• 68 citations
Li et al.
-
Multi-perspective Context Matching For Machine Comprehension
(2016)
• Arxiv
• 115 citations
Wang et al.
-
A Network-based End-to-end Trainable Task-oriented Dialogue System
(2016)
• Arxiv
• 170 citations
Wen et al.
-
Joint Copying And Restricted Generation For Paraphrase
(2016)
• Arxiv
• 61 citations
Cao et al.
-
StackGAN: Text To Photo-realistic Image Synthesis With Stacked Generative Adversarial Networks
(2016)
• Arxiv
• 227 citations
Zhang et al.
-
Image Captioning With Deep Bidirectional LSTMs
(2016)
• Proceedings of the 24th ACM international conference on Multimedia
• 262 citations
Wang et al.
-
Learning To Generalize To New Compositions In Image Understanding
(2016)
• Arxiv
• 53 citations
Atzmon et al.
-
Modeling Context In Referring Expressions
(2016)
• Lecture Notes in Computer Science
• 895 citations
Yu et al.
-
Zoneout: Regularizing RNNs By Randomly Preserving Hidden Activations
(2016)
• Arxiv
• 173 citations
Krueger et al.
-
Embracing Data Abundance: BookTest Dataset For Reading Comprehension
(2016)
• Arxiv
• 56 citations
Ondrej Bajgar, Rudolf Kadlec, Jan Kleindienst
-
The LAMBADA Dataset: Word Prediction Requiring A Broad Discourse Context
(2016)
• Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
• 99 citations
Paperno et al.
-
MS MARCO: A Human Generated Machine Reading Comprehension Dataset
(2016)
• Arxiv
• 440 citations
Bajaj et al.
-
RNN Approaches To Text Normalization: A Challenge
(2016)
• Arxiv
• 55 citations
Richard Sproat, Navdeep Jaitly
-
Visual Genome: Connecting Language And Vision Using Crowdsourced Dense Image Annotations
(2016)
• International Journal of Computer Vision
• 4911 citations
Krishna et al.
-
A Context-aware Attention Network For Interactive Question Answering
(2016)
• Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
• 40 citations
Li et al.
-
Title Generation For User Generated Videos
(2016)
• Lecture Notes in Computer Science
• 65 citations
Zeng et al.
-
Modelling Interaction Of Sentence Pair With Coupled-LSTMs
(2016)
• Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
• 43 citations
Pengfei Liu, Xipeng Qiu, Xuanjing Huang
-
Distraction-based Neural Networks For Document Summarization
(2016)
• IJCAI 2016
• 61 citations
Chen et al.
-
WikiReading: A Novel Large-scale Language Understanding Task Over Wikipedia
(2016)
• Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
• 128 citations
Hewlett et al.
-
Revisiting Visual Question Answering Baselines
(2016)
• Lecture Notes in Computer Science
• 224 citations
Allan Jabri, Armand Joulin, Laurens van der Maaten
-
Attentive Explanations: Justifying Decisions And Pointing To The Evidence
(2016)
• Arxiv
• 55 citations
Park et al.
-
TGIF: A New Dataset And Benchmark On Animated GIF Description
(2016)
• 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
• 197 citations
Li et al.
-
Towards Sub-word Level Compositions For Sentiment Analysis Of Hindi-English Code Mixed Text
(2016)
• Arxiv
• 128 citations
Prabhu et al.
-
Text Understanding With The Attention Sum Reader Network
(2016)
• Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
• 46 citations
Kadlec et al.
-
Pointer Sentinel Mixture Models
(2016)
• Arxiv
• 481 citations
Merity et al.
-
Tracking The World State With Recurrent Entity Networks
(2016)
• ICLR 2017
• 157 citations
Henaff et al.