Publications by Tag
The following tags appear in the publications listed in the review:
ACL, Agentic, Applications, Arxiv, Attention Mechanism, BERT, Bias Mitigation, COLING, Dataset, Distillation, Efficiency and Optimization, EMNLP, Ethics and Bias, Evaluation, Fairness, Few-Shot, Fine-Tuning, GPT, Has Code, ICLR, ICML, In-Context Learning, Interpretability and Explainability, INTERSPEECH, KDD, Language Modeling, Large-Scale Training, LREC, Masked Language Model, Merging, Model Architecture, Multimodal Models, NeurIPS, Pre-Training, Prompting, Pruning, Quantization, RAG, RecSys, Reinforcement Learning, Responsible AI, Scaling Laws, Security, SLT, Survey Paper, TACL, Time Series, Tokenization, Tools, Training Techniques, Transformer, Uncategorized, Vector Indexing, WMT
Tags
Below is a list of all tags and the papers associated with each.
🏷 ACL
- A User Simulator For Task-completion Dialogues Xiujun Li et al.
- Non-monotonic Sequential Text Generation Sean Welleck, Kianté Brantley, Hal Daumé III, Kyunghyun Cho
- Language Models As Knowledge Bases? Fabio Petroni et al.
- Neural Assistant: Joint Action Prediction, Response Generation, And Latent Knowledge Reasoning Arvind Neelakantan et al.
- Training Neural Response Selection For Task-oriented Dialogue Systems Matthew Henderson et al.
- Transfer Fine-tuning: A BERT Case Study Yuki Arase, Junichi Tsujii
- Transformers As Soft Reasoners Over Language Peter Clark, Oyvind Tafjord, Kyle Richardson
- EDITOR: An Edit-based Transformer With Repositioning For Neural Machine Translation With Soft Lexical Constraints Weijia Xu, Marine Carpuat
- Inducing Language-agnostic Multilingual Representations Wei Zhao, Steffen Eger, Johannes Bjerva, Isabelle Augenstein
- A Token-level Reference-free Hallucination Detection Benchmark For Free-form Text Generation Tianyu Liu et al.
- DYLE: Dynamic Latent Extraction For Abstractive Long-input Summarization Ziming Mao et al.
- Tacl: Improving BERT Pre-training With Token-aware Contrastive Learning Yixuan Su et al.
- Visqa: X-raying Vision And Language Reasoning In Transformers Theo Jaunet et al.
- Dialfred: Dialogue-enabled Agents For Embodied Instruction Following Xiaofeng Gao et al.
- Galactica: A Large Language Model For Science Ross Taylor et al.
- TIARA: Multi-grained Retrieval For Robust Question Answering Over Large Knowledge Bases Yiheng Shu et al.
- Code Generation Tools (almost) For Free? A Study Of Few-shot, Pre-trained Language Models On Code Patrick Bareiß, Beatriz Souza, Marcelo d'Amorim, Michael Pradel
- Improving CLIP Training With Language Rewrites Lijie Fan, Dilip Krishnan, Phillip Isola, Dina Katabi, Yonglong Tian
- Increasing Diversity While Maintaining Accuracy: Text Data Generation With Large Language Models And Human Interventions John Joon Young Chung, Ece Kamar, Saleema Amershi
- H\(_2\)O: Heavy-hitter Oracle For Efficient Generative Inference Of Large Language Models Zhenyu Zhang et al.
- Api-bank: A Comprehensive Benchmark For Tool-augmented Llms Minghao Li et al.
- Is Chatgpt The Ultimate Programming Assistant -- How Far Is It? Haoye Tian et al.
- Survey Of Vulnerabilities In Large Language Models Revealed By Adversarial Attacks Erfan Shayegani et al.
- Llm-blender: Ensembling Large Language Models With Pairwise Ranking And Generative Fusion Dongfu Jiang, Xiang Ren, Bill Yuchen Lin
- Generative Speech Recognition Error Correction With Large Language Models And Task-activating Prompting Chao-Han Huck Yang et al.
- Hallucination Augmented Contrastive Learning For Multimodal Large Language Model Chaoya Jiang et al.
- Swiftsage: A Generative Agent With Fast And Slow Thinking For Complex Interactive Tasks Bill Yuchen Lin et al.
- Can Chatgpt And Bard Generate Aligned Assessment Items? A Reliability Analysis Against Human Performance Abdolvahab Khademi
- On Learning To Summarize With Large Language Models As References Yixin Liu et al.
- Efficient And Effective Text Encoding For Chinese Llama And Alpaca Yiming Cui, Ziqing Yang, Xin Yao
- Trustworthy Llms: A Survey And Guideline For Evaluating Large Language Models' Alignment Yang Liu et al.
- Materials Science In The Era Of Large Language Models: A Perspective Ge Lei, Ronan Docherty, Samuel J. Cooper
🏷 Agentic
- Deep Active Learning For Dialogue Generation Nabiha Asghar, Pascal Poupart, Xin Jiang, Hang Li
- An Actor-critic Algorithm For Sequence Prediction Dzmitry Bahdanau et al.
- A User Simulator For Task-completion Dialogues Xiujun Li et al.
- Deep Reinforcement Learning For Dialogue Generation Jiwei Li et al.
- A Simple, Fast Diverse Decoding Algorithm For Neural Generation Jiwei Li, Will Monroe, Dan Jurafsky
- Steering Output Style And Topic In Neural Response Generation Di Wang, Nebojsa Jojic, Chris Brockett, Eric Nyberg
- Sample-efficient Actor-critic Reinforcement Learning With Supervised Data For Dialogue Management Pei-Hao Su, Paweł Budzianowski, Stefan Ultes, Milica Gasic, Steve Young
- Mojitalk: Generating Emotional Responses At Scale Xianda Zhou, William Yang Wang
- Long Text Generation Via Adversarial Training With Leaked Information Jiaxian Guo et al.
- Adversarial Learning For Neural Dialogue Generation Jiwei Li et al.
- Ask The Right Questions: Active Question Reformulation With Reinforcement Learning Christian Buck et al.
- Data Distillation For Controlling Specificity In Dialogue Generation Jiwei Li, Will Monroe, Dan Jurafsky
- Latent Intention Dialogue Models Tsung-Hsien Wen, Yishu Miao, Phil Blunsom, Steve Young
- R\(^3\): Reinforced Reader-ranker For Open-domain Question Answering Shuohang Wang et al.
- End-to-end Optimization Of Goal-driven And Visually Grounded Dialogue Systems Florian Strub et al.
- Batch Policy Gradient Methods For Improving Neural Conversation Models Kirthevasan Kandasamy, Yoram Bachrach, Ryota Tomioka, Daniel Tarlow, David Carter
- Gated-attention Architectures For Task-oriented Language Grounding Devendra Singh Chaplot, Kanthashree Mysore Sathyendra, Rama Kumar Pasumarthi, Dheeraj Rajagopal, Ruslan Salakhutdinov
- Parlai: A Dialog Research Software Platform Alexander H. Miller et al.
- A Deep Reinforcement Learning Chatbot Iulian V. Serban et al.
- Fine Grained Knowledge Transfer For Personalized Task-oriented Dialogue Systems Kaixiang Mo, Yu Zhang, Qiang Yang, Pascale Fung
- Dialogue Generation: From Imitation Learning To Inverse Reinforcement Learning Ziming Li, Julia Kiseleva, Maarten de Rijke
- Extending Neural Generative Conversational Model Using External Knowledge Sources Prasanna Parthasarathi, Joelle Pineau
- Towards Explainable And Controllable Open Domain Dialogue Generation With Dialogue Acts Can Xu, Wei Wu, Yu Wu
- Babyai: A Platform To Study The Sample Efficiency Of Grounded Language Learning Maxime Chevalier-Boisvert et al.
- Wizard Of Wikipedia: Knowledge-powered Conversational Agents Emily Dinan et al.
- Toward Diverse Text Generation With Inverse Reinforcement Learning Zhan Shi, Xinchi Chen, Xipeng Qiu, Xuanjing Huang
- Hybrid Retrieval-generation Reinforced Agent For Medical Image Report Generation Christy Y. Li, Xiaodan Liang, Zhiting Hu, Eric P. Xing
- Training Millions Of Personalized Dialogue Agents Pierre-Emmanuel Mazaré, Samuel Humeau, Martin Raison, Antoine Bordes
- Zero-shot Adaptive Transfer For Conversational Language Understanding Sungjin Lee, Rahul Jha
- Towards Empathetic Open-domain Conversation Models: A New Benchmark And Dataset Hannah Rashkin, Eric Michael Smith, Margaret Li, Y-Lan Boureau
- On Evaluating And Comparing Open Domain Dialog Systems Anu Venkatesh et al.
- Polite Dialogue Generation Without Parallel Data Tong Niu, Mohit Bansal
- Conversational AI: The Science Behind The Alexa Prize Ashwin Ram et al.
- Guiding Policies With Language Via Meta-learning John D. Co-Reyes et al.
- Ensemble-based Deep Reinforcement Learning For Chatbots Heriberto Cuayáhuitl et al.
- Multimodal Attention Networks For Low-level Vision-and-language Navigation Federico Landi, Lorenzo Baraldi, Marcella Cornia, Massimiliano Corsini, Rita Cucchiara
- Generating Persona Consistent Dialogues By Exploiting Natural Language Inference Haoyu Song, Wei-Nan Zhang, Jingwen Hu, Ting Liu
- Say What I Want: Towards The Dark Side Of Neural Dialogue Models Haochen Liu, Tyler Derr, Zitao Liu, Jiliang Tang
- Approximating Interactive Human Evaluation With Self-play For Open-domain Dialog Systems Asma Ghandeharioun et al.
- Consistent Dialogue Generation With Self-supervised Feature Learning Yizhe Zhang et al.
- Generalization In Generation: A Closer Look At Exposure Bias Florian Schmidt
- Review Conversational Reading Comprehension Hu Xu, Bing Liu, Lei Shu, Philip S. Yu
- Transfertransfo: A Transfer Learning Approach For Neural Network Based Conversational Agents Thomas Wolf, Victor Sanh, Julien Chaumond, Clement Delangue
- Caire: An Empathetic Neural Chatbot Zhaojiang Lin et al.
- Reinforced Dynamic Reasoning For Conversational Question Generation Boyuan Pan, Hao Li, Ziyu Yao, Deng Cai, Huan Sun
- Learning From Dialogue After Deployment: Feed Yourself, Chatbot! Braden Hancock, Antoine Bordes, Pierre-Emmanuel Mazaré, Jason Weston
- Personalizing Dialogue Agents Via Meta-learning Zhaojiang Lin, Andrea Madotto, Chien-Sheng Wu, Pascale Fung
- Stabilizing Transformers For Reinforcement Learning Emilio Parisotto et al.
- Learning And Evaluating General Linguistic Intelligence Dani Yogatama et al.
- Deep Learning Based Chatbot Models Richard Csaky
- Reinforcement Learning Based Emotional Editing Constraint Conversation Generation Jia Li, Xiao Sun, Xing Wei, Changliang Li, Jianhua Tao
- Fine-tuning Language Models From Human Preferences Daniel M. Ziegler et al.
- Hello, It's GPT-2 -- How Can I Help You? Towards The Use Of Pretrained Language Models For Task-oriented Dialogue Systems Paweł Budzianowski, Ivan Vulić
- What Makes A Good Conversation? How Controllable Attributes Affect Human Judgments Abigail See, Stephen Roller, Douwe Kiela, Jason Weston
- Generating Empathetic Responses By Looking Ahead The User's Sentiment Jamin Shin, Peng Xu, Andrea Madotto, Pascale Fung
- Do Neural Dialog Systems Use The Conversation History Effectively? An Empirical Study Chinnadhurai Sankar, Sandeep Subramanian, Christopher Pal, Sarath Chandar, Yoshua Bengio
- Using Natural Language For Reward Shaping In Reinforcement Learning Prasoon Goyal, Scott Niekum, Raymond J. Mooney
- Countering Language Drift Via Visual Grounding Jason Lee, Kyunghyun Cho, Douwe Kiela
- Robust Navigation With Language Pretraining And Stochastic Sampling Xiujun Li et al.
- Modelling Hierarchical Structure Between Dialogue Policy And Natural Language Generator With Option Framework For Task-oriented Dialogue System Jianhong Wang, Yuan Zhang, Tae-Kyun Kim, Yunjie Gu
- Artificial Intelligence Versus Maya Angelou: Experimental Evidence That People Cannot Differentiate Ai-generated From Human-written Poetry Nils Köbis, Luca Mossink
- VD-BERT: A Unified Vision And Dialog Transformer With BERT Yue Wang et al.
- Towards Learning A Generic Agent For Vision-and-language Navigation Via Pre-training Weituo Hao, Chunyuan Li, Xiujun Li, Lawrence Carin, Jianfeng Gao
- Low-resource Knowledge-grounded Dialogue Generation Xueliang Zhao et al.
- Alfworld: Aligning Text And Embodied Environments For Interactive Learning Mohit Shridhar et al.
- Neural Generation Meets Real People: Towards Emotionally Engaging Mixed-initiative Conversations Ashwin Paranjape et al.
- Improving Vision-and-language Navigation With Image-text Pairs From The Web Arjun Majumdar et al.
- Addressing Some Limitations Of Transformers With Feedback Memory Angela Fan, Thibaut Lavril, Edouard Grave, Armand Joulin, Sainbayar Sukhbaatar
- A Recurrent Vision-and-language BERT For Navigation Yicong Hong, Qi Wu, Yuankai Qi, Cristian Rodriguez-Opazo, Stephen Gould
- Grounding Language To Autonomously-acquired Skills Via Goal Generation Ahmed Akakzia, Cédric Colas, Pierre-Yves Oudeyer, Mohamed Chetouani, Olivier Sigaud
- Countering Language Drift With Seeded Iterated Learning Yuchen Lu, Soumye Singhal, Florian Strub, Olivier Pietquin, Aaron Courville
- Text Generation By Learning From Demonstrations Richard Yuanzhe Pang, He He
- Human Instruction-following With Deep Reinforcement Learning Via Transfer-learning From Text Felix Hill, Sona Mokra, Nathaniel Wong, Tim Harley
- Multi-modal Open-domain Dialogue Kurt Shuster, Eric Michael Smith, Da Ju, Jason Weston
- Grounded Language Learning Fast And Slow Felix Hill et al.
- Collaborative Storytelling With Large-scale Neural Language Models Eric Nichols, Leo Gao, Randy Gomez
- Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation Ruibo Liu et al.
- Can You Put It All Together: Evaluating Conversational Agents' Ability To Blend Skills Eric Michael Smith, Mary Williamson, Kurt Shuster, Jason Weston, Y-Lan Boureau
- Controlling Style In Generated Dialogue Eric Michael Smith, Diana Gonzalez-Rico, Emily Dinan, Y-Lan Boureau
- Will I Sound Like Me? Improving Persona Consistency In Dialogues Through Pragmatic Self-consciousness Hyunwoo Kim, Byeongchang Kim, Gunhee Kim
- Generate Natural Language Explanations For Recommendation Hanxiong Chen, Xu Chen, Shaoyun Shi, Yongfeng Zhang
- Crossing The Conversational Chasm: A Primer On Natural Language Processing For Multilingual Task-oriented Dialogue Systems Evgeniia Razumovskaia et al.
- Multimodal Dialogue Response Generation Qingfeng Sun et al.
- Bob: BERT Over BERT For Training Persona-based Dialogue Models From Limited Personalized Data Haoyu Song, Yan Wang, Kaiyan Zhang, Wei-Nan Zhang, Ting Liu
- Bitod: A Bilingual Multi-domain Dataset For Task-oriented Dialogue Modeling Zhaojiang Lin et al.
- Societal Biases In Language Generation: Progress And Challenges Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, Nanyun Peng
- TR-BERT: Dynamic Token Reduction For Accelerating BERT Inference Deming Ye, Yankai Lin, Yufei Huang, Maosong Sun
- Internet-augmented Dialogue Generation Mojtaba Komeili, Kurt Shuster, Jason Weston
- Multimodal Transformer With Variable-length Memory For Vision-and-language Navigation Chuang Lin et al.
- A Short Survey Of Pre-trained Language Models For Conversational AI - A New Age In NLP Munazza Zaib, Quan Z. Sheng, Wei Emma Zhang
- Towards Facilitating Empathic Conversations In Online Mental Health Support: A Reinforcement Learning Approach Ashish Sharma, Inna W. Lin, Adam S. Miner, David C. Atkins, Tim Althoff
- Hierarchical Task Learning From Language Instructions With Unified Transformers And Self-monitoring Yichi Zhang, Joyce Chai
- Episodic Transformer For Vision-and-language Navigation Alexander Pashevich, Cordelia Schmid, Chen Sun
- Embodied BERT: A Transformer Model For Embodied, Language-guided Visual Task Completion Alessandro Suglia, Qiaozi Gao, Jesse Thomason, Govind Thattai, Gaurav Sukhatme
- Diagnosing Vision-and-language Navigation: What Really Matters Wanrong Zhu et al.
- Webshop: Towards Scalable Real-world Web Interaction With Grounded Language Agents Shunyu Yao, Howard Chen, John Yang, Karthik Narasimhan
- Coderl: Mastering Code Generation Through Pretrained Models And Deep Reinforcement Learning Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Silvio Savarese, Steven C. H. Hoi
- Inner Monologue: Embodied Reasoning Through Planning With Language Models Wenlong Huang et al.
- Language Models As Zero-shot Planners: Extracting Actionable Knowledge For Embodied Agents Wenlong Huang, Pieter Abbeel, Deepak Pathak, Igor Mordatch
- Dialfred: Dialogue-enabled Agents For Embodied Instruction Following Xiaofeng Gao et al.
- Ignore Previous Prompt: Attack Techniques For Language Models Fábio Perez, Ian Ribeiro
- Planbench: An Extensible Benchmark For Evaluating Large Language Models On Planning And Reasoning About Change Karthik Valmeekam, Matthew Marquez, Alberto Olmo, Sarath Sreedharan, Subbarao Kambhampati
- Evolution Through Large Models Joel Lehman et al.
- React: Synergizing Reasoning And Acting In Language Models Shunyu Yao et al.
- Teaching Language Models To Support Answers With Verified Quotes Jacob Menick et al.
- Language Models As Agent Models Jacob Andreas
- Training Language Models To Follow Instructions With Human Feedback Long Ouyang et al.
- The Goldilocks Of Pragmatic Understanding: Fine-tuning Strategy Matters For Implicature Resolution By Llms Laura Ruis et al.
- Blenderbot 3: A Deployed Conversational Agent That Continually Learns To Responsibly Engage Kurt Shuster et al.
- Do As I Can, Not As I Say: Grounding Language In Robotic Affordances Michael Ahn et al.
- Is Reinforcement Learning (not) For Natural Language Processing: Benchmarks, Baselines, And Building Blocks For Natural Language Policy Optimization Rajkumar Ramamurthy et al.
- Gpt-3-driven Pedagogical Agents For Training Children's Curious Question-asking Skills Rania Abdelghani et al.
- Red Teaming Language Models With Language Models Ethan Perez et al.
- Quark: Controllable Text Generation With Reinforced Unlearning Ximing Lu et al.
- VIMA: General Robot Manipulation With Multimodal Prompts Yunfan Jiang et al.
- Future Transformer For Long-term Action Anticipation Dayoung Gong, Joonseok Lee, Manjin Kim, Seong Jong Ha, Minsu Cho
- Language And Culture Internalisation For Human-like Autotelic AI Cédric Colas, Tristan Karch, Clément Moulin-Frier, Pierre-Yves Oudeyer
- Llm-planner: Few-shot Grounded Planning For Embodied Agents With Large Language Models Chan Hee Song et al.
- Optimizing Prompts For Text-to-image Generation Yaru Hao, Zewen Chi, Li Dong, Furu Wei
- Enabling Conversational Interaction With Mobile UI Using Large Language Models Bryan Wang, Gang Li, Yang Li
- Multimodal Knowledge Alignment With Reinforcement Learning Youngjae Yu et al.
- Reshaping Robot Trajectories Using Natural Language Commands: A Study Of Multi-modal Data Alignment Using Transformers Arthur Bucker et al.
- The AI Teacher Test: Measuring The Pedagogical Ability Of Blender And GPT-3 In Educational Dialogues Anaïs Tack, Chris Piech
- Don't Generate, Discriminate: A Proposal For Grounding Language Models To Real-world Environments Yu Gu, Xiang Deng, Yu Su
- Improving Alignment Of Dialogue Agents Via Targeted Human Judgements Amelia Glaese et al.
- Active Example Selection For In-context Learning Yiming Zhang, Shi Feng, Chenhao Tan
- A Model-agnostic Data Manipulation Method For Persona-based Dialogue Generation Yu Cao, Wei Bi, Meng Fang, Shuming Shi, Dacheng Tao
- Storydall-e: Adapting Pretrained Text-to-image Transformers For Story Continuation Adyasha Maharana, Darryl Hannan, Mohit Bansal
- A New Path: Scaling Vision-and-language Navigation With Synthetic Instructions And Imitation Learning Aishwarya Kamath et al.
- Chatgpt: The End Of Online Exam Integrity? Teo Susnjak
- Contrastive Learning Reduces Hallucination In Conversations Weiwei Sun et al.
- Rlprompt: Optimizing Discrete Text Prompts With Reinforcement Learning Mingkai Deng et al.
- Meta Policy Learning For Cold-start Conversational Recommendation Zhendong Chu, Hongning Wang, Yun Xiao, Bo Long, Lingfei Wu
- Leancontext: Cost-efficient Domain-specific Question Answering Using Llms Md Adnan Arefeen, Biplob Debnath, Srimat Chakradhar
- Unleashing The Emergent Cognitive Synergy In Large Language Models: A Task-solving Agent Through Multi-persona Self-collaboration Zhenhailong Wang et al.
- LLM Self Defense: By Self Examination, Llms Know They Are Being Tricked Mansi Phute et al.
- Driving With Llms: Fusing Object-level Vector Modality For Explainable Autonomous Driving Long Chen et al.
- Just Ask For Calibration: Strategies For Eliciting Calibrated Confidence Scores From Language Models Fine-tuned With Human Feedback Katherine Tian et al.
- The Rise And Potential Of Large Language Model Based Agents: A Survey Zhiheng Xi et al.
- Agentcf: Collaborative Learning With Autonomous Language Agents For Recommender Systems Junjie Zhang et al.
- Qwen Technical Report Jinze Bai et al.
- Badgpt: Exploring Security Vulnerabilities Of Chatgpt Via Backdoor Attacks To Instructgpt Jiawen Shi, Yixin Liu, Pan Zhou, Lichao Sun
- Language Models Meet World Models: Embodied Experiences Enhance Language Models Jiannan Xiang et al.
- Think-on-graph: Deep And Responsible Reasoning Of Large Language Model On Knowledge Graph Jiashuo Sun et al.
- Llm-grounder: Open-vocabulary 3D Visual Grounding With Large Language Model As An Agent Jianing Yang et al.
- Paperqa: Retrieval-augmented Generative Agent For Scientific Research Jakub Lála et al.
- Principle-driven Self-alignment Of Language Models From Scratch With Minimal Human Supervision Zhiqing Sun et al.
- Theory Of Mind For Multi-agent Collaboration Via Large Language Models Huao Li et al.
- "It's A Fair Game", Or Is It? Examining How Users Navigate Disclosure Risks And Benefits When Using Llm-based Conversational Agents Zhiping Zhang et al.
- Building Cooperative Embodied Agents Modularly With Large Language Models Hongxin Zhang et al.
- Llmind: Orchestrating AI And Iot With LLM For Complex Task Execution Hongwei Cui, Yuyang Du, Qun Yang, Yulin Shao, Soung Chang Liew
- Boosting Theory-of-mind Performance In Large Language Models Via Prompting Shima Rahimi Moghaddam, Christopher J. Honey
- Reasoning With Language Model Is Planning With World Model Shibo Hao et al.
- Next-gpt: Any-to-any Multimodal LLM Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, Tat-Seng Chua
- Generating Phishing Attacks Using Chatgpt Sayak Saha Roy, Krishna Vamsi Naragam, Shirin Nilizadeh
- Secrets Of RLHF In Large Language Models Part I: PPO Rui Zheng et al.
- Automatic Prompt Optimization With "Gradient Descent" And Beam Search Reid Pryzant et al.
- VELMA: Verbalization Embodiment Of LLM Agents For Vision And Language Navigation In Street View Raphael Schumann et al.
- Can We Trust The Evaluation On Chatgpt? Rachith Aiyappa, Jisun An, Haewoon Kwak, Yong-yeol Ahn
- Direct Preference Optimization: Your Language Model Is Secretly A Reward Model Rafael Rafailov et al.
- Autogen: Enabling Next-gen LLM Applications Via Multi-agent Conversation Qingyun Wu et al.
- Dspy: Compiling Declarative Language Model Calls Into Self-improving Pipelines Omar Khattab et al.
- Reflexion: Language Agents With Verbal Reinforcement Learning Noah Shinn et al.
- Do Llms Understand Social Knowledge? Evaluating The Sociability Of Large Language Models With Socket Benchmark Minje Choi, Jiaxin Pei, Sagar Kumar, Chang Shu, David Jurgens
- Reward Design With Language Models Minae Kwon, Sang Michael Xie, Kalesha Bullard, Dorsa Sadigh
- Video-chatgpt: Towards Detailed Video Understanding Via Large Vision And Language Models Muhammad Maaz, Hanoona Rasheed, Salman Khan, Fahad Shahbaz Khan
- LAMM: Language-assisted Multi-modal Instruction-tuning Dataset, Framework, And Benchmark Zhenfei Yin et al.
- Encouraging Divergent Thinking In Large Language Models Through Multi-agent Debate Tian Liang et al.
- Grounding Large Language Models In Interactive Environments With Online Reinforcement Learning Thomas Carta et al.
- Cognitive Architectures For Language Agents Theodore R. Sumers, Shunyu Yao, Karthik Narasimhan, Thomas L. Griffiths
- Deception Abilities Emerged In Large Language Models Thilo Hagendorff
- Chatgpt Is Fun, But It Is Not Funny! Humor Is Still Challenging Large Language Models Sophie Jentzsch, Kristian Kersting
- Zhongjing: Enhancing The Chinese Medical Capabilities Of Large Language Model Through Expert Feedback And Real-world Multi-turn Dialogue Songhua Yang et al.
- Metagpt: Meta Programming For A Multi-agent Collaborative Framework Sirui Hong et al.
- Is Chatgpt Good At Search? Investigating Large Language Models As Re-ranking Agents Weiwei Sun et al.
- Roco: Dialectic Multi-robot Collaboration With Large Language Models Zhao Mandi, Shreeya Jain, Shuran Song
- Chain Of Hindsight Aligns Language Models With Feedback Hao Liu, Carmelo Sferrazza, Pieter Abbeel
- Personalisation Within Bounds: A Risk Taxonomy And Policy Framework For The Alignment Of Large Language Models With Personalised Feedback Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale
- Personallm: Investigating The Ability Of Large Language Models To Express Personality Traits Hang Jiang et al.
- Wizardmath: Empowering Mathematical Reasoning For Large Language Models Via Reinforced Evol-instruct Haipeng Luo et al.
- Voyager: An Open-ended Embodied Agent With Large Language Models Guanzhi Wang et al.
- Language Models Can Solve Computer Tasks Geunwoo Kim, Pierre Baldi, Stephen McAleer
- Personality Traits In Large Language Models Greg Serapio-García et al.
- Navgpt: Explicit Reasoning In Vision-and-language Navigation With Large Language Models Gengze Zhou, Yicong Hong, Qi Wu
- Do Large Language Models Show Decision Heuristics Similar To Humans? A Case Study Using GPT-3.5 Gaurav Suri, Lily R. Slater, Ali Ziaee, Morgan Nguyen
- Preference Ranking Optimization For Human Alignment Feifan Song et al.
- Survey Of Vulnerabilities In Large Language Models Revealed By Adversarial Attacks Erfan Shayegani et al.
- Weak-to-strong Generalization: Eliciting Strong Capabilities With Weak Supervision Collin Burns et al.
- LIMA: Less Is More For Alignment Chunting Zhou et al.
- Drivelm: Driving With Graph Visual Question Answering Chonghao Sima et al.
- Whitefox: White-box Compiler Fuzzing Empowered By Large Language Models Chenyuan Yang et al.
- Chateval: Towards Better Llm-based Evaluators Through Multi-agent Debate Chi-Min Chan et al.
- Chatdev: Communicative Agents For Software Development Chen Qian et al.
- Memgpt: Towards Llms As Operating Systems Charles Packer et al.
- Reinforced Self-training (rest) For Language Modeling Caglar Gulcehre et al.
- A Short Survey Of Viewing Large Language Models In Legal Aspect Zhongxiang Sun
- Swiftsage: A Generative Agent With Fast And Slow Thinking For Complex Interactive Tasks Bill Yuchen Lin et al.
- Expertprompting: Instructing Large Language Models To Be Distinguished Experts Benfeng Xu et al.
- Chatgpt: Applications, Opportunities, And Threats Aram Bahrini et al.
- Expel: LLM Agents Are Experiential Learners Andrew Zhao et al.
- Chemcrow: Augmenting Large-language Models With Chemistry Tools Andres M Bran et al.
- Openassistant Conversations -- Democratizing Large Language Model Alignment Andreas Köpf et al.
- Opening Up Chatgpt: Tracking Openness, Transparency, And Accountability In Instruction-tuned Text Generators Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse
- On Generative Agents In Recommendation An Zhang, Yuxin Chen, Leheng Sheng, Xiang Wang, Tat-Seng Chua
- Self-refine: Iterative Refinement With Self-feedback Aman Madaan et al.
- Embodiedgpt: Vision-language Pre-training Via Embodied Chain Of Thought Yao Mu et al.
- Better To Ask In English: Cross-lingual Evaluation Of Large Language Models For Healthcare Queries Yiqiao Jin et al.
- Improving Factuality And Reasoning In Language Models Through Multiagent Debate Yilun Du, Shuang Li, Antonio Torralba, Joshua B. Tenenbaum, Igor Mordatch
- Summary Of Chatgpt-related Research And Perspective Towards The Future Of Large Language Models Yiheng Liu et al.
- INSTRUCTEVAL: Towards Holistic Evaluation Of Instruction-tuned Large Language Models Yew Ken Chia, Pengfei Hong, Lidong Bing, Soujanya Poria
- Hugginggpt: Solving AI Tasks With Chatgpt And Its Friends In Hugging Face Yongliang Shen et al.
- Fundamental Limitations Of Alignment In Large Language Models Yotam Wolf, Noam Wies, Oshri Avnery, Yoav Levine, Amnon Shashua
- March In Chat: Interactive Prompting For Remote Embodied Referring Expression Yanyuan Qiao, Yuankai Qi, Zheng Yu, Jing Liu, Qi Wu
- Recmind: Large Language Model Powered Agent For Recommendation Yancheng Wang et al.
- Chat With The Environment: Interactive Multimodal Perception Using Large Language Models Xufeng Zhao, Mengdi Li, Cornelius Weber, Muhammad Burhan Hafez, Stefan Wermter
- Improving Language Model Negotiation With Self-play And In-context Learning From AI Feedback Yao Fu, Hao Peng, Tushar Khot, Mirella Lapata
- Ghost In The Minecraft: Generally Capable Agents For Open-world Environments Via Large Language Models With Text-based Knowledge And Memory Xizhou Zhu et al.
- Query Rewriting For Retrieval-augmented Large Language Models Xinbei Ma, Yeyun Gong, Pengcheng He, Hai Zhao, Nan Duan
- Unveiling Security, Privacy, And Ethical Concerns Of Chatgpt Xiaodong Wu, Ran Duan, Jianbing Ni
- Medagents: Large Language Models As Collaborators For Zero-shot Medical Reasoning Xiangru Tang et al.
- Cogagent: A Visual Language Model For GUI Agents Wenyi Hong et al.
- Guiding Pretraining In Reinforcement Learning With Large Language Models Yuqing Du et al.
- Describe, Explain, Plan And Select: Interactive Planning With Large Language Models Enables Open-world Multi-task Agents Zihao Wang et al.
- Character-llm: A Trainable Agent For Role-playing Yunfan Shao, Linyang Li, Junqi Dai, Xipeng Qiu
- Guiding Large Language Models Via Directional Stimulus Prompting Zekun Li et al.
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- When Large Language Model Agents Meet 6G Networks: Perception, Grounding, And Alignment Minrui Xu et al.
- History Of Generative Artificial Intelligence (AI) Chatbots: Past, Present, And Future Development Md. Al-Amin et al.
- A Review Of Large Language Models And Autonomous Agents In Chemistry Mayk Caldas Ramos, Christopher J. Collison, Andrew D. White
- Clochat: Understanding How People Customize, Interact, And Experience Personas In Large Language Models Juhye Ha, Hyeon Jeon, Daeun Han, Jinwook Seo, Changhoon Oh
- Building Better AI Agents: A Provocation On The Utilisation Of Persona In Llm-based Conversational Agents Guangzhi Sun, Xiao Zhan, Jose Such
- Deepseek-v2: A Strong, Economical, And Efficient Mixture-of-experts Language Model DeepSeek-AI et al.
- Understanding Large-language Model (llm)-powered Human-robot Interaction Callie Y. Kim, Christine P. Lee, Bilge Mutlu
- Autocoderover: Autonomous Program Improvement Yuntong Zhang, Haifeng Ruan, Zhiyu Fan, Abhik Roychoudhury
- Survey On Large Language Model-enhanced Reinforcement Learning: Concept, Taxonomy, And Methods Yuji Cao et al.
- Deepseek-r1: Incentivizing Reasoning Capability In Llms Via Reinforcement Learning DeepSeek-AI et al.
🏷 Applications
- Generative Deep Neural Networks For Dialogue: A Short Review Iulian Vlad Serban, Ryan Lowe, Laurent Charlin, Joelle Pineau
- Steering Output Style And Topic In Neural Response Generation Di Wang, Nebojsa Jojic, Chris Brockett, Eric Nyberg
- Long Text Generation Via Adversarial Training With Leaked Information Jiaxian Guo et al.
- Efficient Contextualized Representation: Language Model Pruning For Sequence Labeling Liyuan Liu, Xiang Ren, Jingbo Shang, Jian Peng, Jiawei Han
- Seq2seq-vis: A Visual Debugging Tool For Sequence-to-sequence Models Hendrik Strobelt et al.
- Disentangling Language And Knowledge In Task-oriented Dialogs Dinesh Raghu, Nikhil Gupta, Mausam
- Polite Dialogue Generation Without Parallel Data Tong Niu, Mohit Bansal
- Nemo: A Toolkit For Building AI Applications Using Neural Modules Oleksii Kuchaiev et al.
- Convert: Efficient And Accurate Conversational Representations From Transformers Matthew Henderson et al.
- Mixture Content Selection For Diverse Sequence Generation Jaemin Cho, Minjoon Seo, Hannaneh Hajishirzi
- Say What I Want: Towards The Dark Side Of Neural Dialogue Models Haochen Liu, Tyler Derr, Zitao Liu, Jiliang Tang
- Codegru: Context-aware Deep Learning With Gated Recurrent Unit For Source Code Modeling Yasir Hussain, Zhiqiu Huang, Yu Zhou, Senzhang Wang
- Learning From Explanations With Neural Execution Tree Ziqi Wang et al.
- A Multiscale Visualization Of Attention In The Transformer Model Jesse Vig
- Plug And Play Language Models: A Simple Approach To Controlled Text Generation Sumanth Dathathri et al.
- Megatron-lm: Training Multi-billion Parameter Language Models Using Model Parallelism Mohammad Shoeybi et al.
- Transfer Fine-tuning: A BERT Case Study Yuki Arase, Junichi Tsujii
- Few-shot NLG With Pre-trained Language Model Zhiyu Chen, Harini Eavani, Wenhu Chen, Yinyin Liu, William Yang Wang
- Visualizing Attention In Transformer-based Language Representation Models Jesse Vig
- Controlled Hallucinations: Learning To Generate Faithfully From Noisy Data Katja Filippova
- Linformer: Self-attention With Linear Complexity Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, Hao Ma
- Phobert: Pre-trained Language Models For Vietnamese Dat Quoc Nguyen, Anh Tuan Nguyen
- Pymt5: Multi-mode Translation Of Natural Language And Python Code With Transformers Colin B. Clement, Dawn Drain, Jonathan Timcheck, Alexey Svyatkovskiy, Neel Sundaresan
- Lite Transformer With Long-short Range Attention Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han
- Deebert: Dynamic Early Exiting For Accelerating BERT Inference Ji Xin, Raphael Tang, Jaejun Lee, Yaoliang Yu, Jimmy Lin
- TOD-BERT: Pre-trained Natural Language Understanding For Task-oriented Dialogue Chien-sheng Wu, Steven Hoi, Richard Socher, Caiming Xiong
- Compressing Large-scale Transformer-based Models: A Case Study On BERT Prakhar Ganesh et al.
- Gshard: Scaling Giant Models With Conditional Computation And Automatic Sharding Dmitry Lepikhin et al.
- Codebert: A Pre-trained Model For Programming And Natural Languages Zhangyin Feng et al.
- Few-shot Natural Language Generation For Task-oriented Dialog Baolin Peng et al.
- SPECTER: Document-level Representation Learning Using Citation-informed Transformers Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld
- ECONET: Effective Continual Pretraining Of Language Models For Event Temporal Reasoning Rujun Han, Xiang Ren, Nanyun Peng
- GOBO: Quantizing Attention-based NLP Models For Low Latency And Energy Efficient Inference Ali Hadi Zadeh, Isak Edo, Omar Mohamed Awad, Andreas Moshovos
- GREEK-BERT: The Greeks Visiting Sesame Street John Koutsikakis, Ilias Chalkidis, Prodromos Malakasiotis, Ion Androutsopoulos
- Measuring And Reducing Gendered Correlations In Pre-trained Models Kellie Webster et al.
- The Cascade Transformer: An Application For Efficient Answer Sentence Selection Luca Soldaini, Alessandro Moschitti
- A Survey Of Knowledge-enhanced Text Generation Wenhao Yu et al.
- XTREME: A Massively Multilingual Multi-task Benchmark For Evaluating Cross-lingual Generalization Junjie Hu et al.
- Big Bird: Transformers For Longer Sequences Manzil Zaheer et al.
- Continual Learning For Natural Language Generation In Task-oriented Dialog Systems Fei Mi, Liangwei Chen, Mengjie Zhao, Minlie Huang, Boi Faltings
- KGPT: Knowledge-grounded Pre-training For Data-to-text Generation Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang
- Aragpt2: Pre-trained Transformer For Arabic Language Generation Wissam Antoun, Fady Baly, Hazem Hajj
- Lightseq: A High Performance Inference Library For Transformers Xiaohui Wang, Ying Xiong, Yang Wei, Mingxuan Wang, Lei Li
- Minilm: Deep Self-attention Distillation For Task-agnostic Compression Of Pre-trained Transformers Wenhui Wang et al.
- Bartscore: Evaluating Generated Text As Text Generation Weizhe Yuan, Graham Neubig, Pengfei Liu
- Prompt Programming For Large Language Models: Beyond The Few-shot Paradigm Laria Reynolds, Kyle Mcdonell
- Fine-tuning Large Neural Language Models For Biomedical Natural Language Processing Robert Tinn et al.
- One Chatbot Per Person: Creating Personalized Chatbots Based On Implicit User Profiles Zhengyi Ma, Zhicheng Dou, Yutao Zhu, Hanxun Zhong, Ji-rong Wen
- Societal Biases In Language Generation: Progress And Challenges Emily Sheng, Kai-wei Chang, Premkumar Natarajan, Nanyun Peng
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- Generate, Annotate, And Learn: NLP With Synthetic Text Xuanli He, Islam Nassar, Jamie Kiros, Gholamreza Haffari, Mohammad Norouzi
- TR-BERT: Dynamic Token Reduction For Accelerating BERT Inference Deming Ye, Yankai Lin, Yufei Huang, Maosong Sun
- True Few-shot Learning With Prompts -- A Real-world Perspective Timo Schick, Hinrich Schütze
- One Teacher Is Enough? Pre-trained Language Model Distillation From Multiple Teachers Chuhan Wu, Fangzhao Wu, Yongfeng Huang
- Newsbert: Distilling Pre-trained Language Model For Intelligent News Application Chuhan Wu et al.
- Counterfactual Memorization In Neural Language Models Chiyuan Zhang et al.
- A Token-level Reference-free Hallucination Detection Benchmark For Free-form Text Generation Tianyu Liu et al.
- What Changes Can Large-scale Language Models Bring? Intensive Study On Hyperclova: Billions-scale Korean Generative Pretrained Transformers Boseop Kim et al.
- Openprompt: An Open-source Framework For Prompt-learning Ning Ding et al.
- Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners Ningyu Zhang et al.
- AI Chains: Transparent And Controllable Human-ai Interaction By Chaining Large Language Model Prompts Tongshuang Wu, Michael Terry, Carrie J. Cai
- Prune Once For All: Sparse Pre-trained Language Models Ofir Zafrir, Ariel Larey, Guy Boudoukh, Haihao Shen, Moshe Wasserblat
- Human Parity On Commonsenseqa: Augmenting Self-attention With External Attention Yichong Xu et al.
- Wordcraft: A Human-ai Collaborative Editor For Story Writing Andy Coenen, Luke Davis, Daphne Ippolito, Emily Reif, Ann Yuan
- Mind The Gap: Assessing Temporal Generalization In Neural Language Models Angeliki Lazaridou et al.
- TURINGBENCH: A Benchmark Environment For Turing Test In The Age Of Neural Text Generation Adaku Uchendu, Zeyu Ma, Thai Le, Rui Zhang, Dongwon Lee
- Distilling Large Language Models Into Tiny And Effective Students Using Pqrnn Prabhu Kaliamoorthi, Aditya Siddhant, Edward Li, Melvin Johnson
- Fastmoe: A Fast Mixture-of-expert Training System Jiaao He et al.
- Lightningdot: Pre-training Visual-semantic Embeddings For Real-time Image-text Retrieval Siqi Sun et al.
- Dialogue History Matters! Personalized Response Selectionin Multi-turn Retrieval-based Chatbots Juntao Li et al.
- Visualgpt: Data-efficient Adaptation Of Pretrained Language Models For Image Captioning Jun Chen, Han Guo, Kai Yi, Boyang Li, Mohamed Elhoseiny
- Interactive And Visual Prompt Engineering For Ad-hoc Task Adaptation With Large Language Models Hendrik Strobelt et al.
- What Do Llms Know About Financial Markets? A Case Study On Reddit Market Sentiment Analysis Xiang Deng, Vasilisa Bashlovkina, Feng Han, Simon Baumgartner, Michael Bendersky
- Reasoning With Language Model Prompting: A Survey Shuofei Qiao et al.
- A Survey Of Controllable Text Generation Using Transformer-based Pre-trained Language Models Hanqing Zhang, Haolin Song, Shaoyu Li, Ming Zhou, Dawei Song
- How To Prompt? Opportunities And Challenges Of Zero- And Few-shot Learning For Human-ai Interaction In Creative Applications Of Generative Models Hai Dang, Lukas Mecke, Florian Lehmann, Sven Goller, Daniel Buschek
- Using Large Language Models To Simulate Multiple Humans And Replicate Human Subject Studies Gati Aher, Rosa I. Arriaga, Adam Tauman Kalai
- Ignore Previous Prompt: Attack Techniques For Language Models Fábio Perez, Ian Ribeiro
- Large Language Models Encode Clinical Knowledge Karan Singhal et al.
- Generating Sequences By Learning To Self-correct Sean Welleck et al.
- Lamda: Language Models For Dialog Applications Romal Thoppilan et al.
- When And Why Vision-language Models Behave Like Bags-of-words, And What To Do About It? Mert Yuksekgonul, Federico Bianchi, Pratyusha Kalluri, Dan Jurafsky, James Zou
- The Optimal BERT Surgeon: Scalable And Accurate Second-order Pruning For Large Language Models Eldar Kurtic et al.
- Bytetransformer: A High-performance Transformer Boosted For Variable-length Inputs Yujia Zhai et al.
- LAVIS: A Library For Language-vision Intelligence Dongxu Li et al.
- Large Language Models Meet Nl2code: A Survey Daoguang Zan et al.
- Prompting GPT-3 To Be Reliable Chenglei Si et al.
- Language Models Are General-purpose Interfaces Yaru Hao et al.
- BLOOM: A 176b-parameter Open-access Multilingual Language Model Bigscience Workshop et al.
- Recurrent Memory Transformer Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev
- Socratic Models: Composing Zero-shot Multimodal Reasoning With Language Andy Zeng et al.
- Compositional Semantic Parsing With Large Language Models Andrew Drozdov et al.
- Prompt-to-prompt Image Editing With Cross Attention Control Amir Hertz et al.
- Contrastive Search Is What You Need For Neural Text Generation Yixuan Su, Nigel Collier
- Standing On The Shoulders Of Giant Frozen Language Models Yoav Levine et al.
- Evaluating Human-language Model Interaction Mina Lee et al.
- Make-a-video: Text-to-video Generation Without Text-video Data Uriel Singer et al.
- 3DALL-E: Integrating Text-to-image AI In 3D Design Workflows Vivian Liu, Jo Vermeulen, George Fitzmaurice, Justin Matejka
- Holistic Evaluation Of Language Models Percy Liang et al.
- Zero- And Few-shot Prompting With Llms: A Comparative Study With Fine-tuned Models For Bangla Sentiment Analysis Md. Arid Hasan et al.
- A Systematic Study And Comprehensive Evaluation Of Chatgpt On Benchmark Datasets Md Tahmid Rahman Laskar et al.
- Drivegpt4: Interpretable End-to-end Autonomous Driving Via Large Language Model Zhenhua Xu et al.
- Distilling Large Language Models For Matching Patients To Clinical Trials Mauro Nievas, Aditya Basu, Yanshan Wang, Hrituraj Singh
- Natural Language Generation And Understanding Of Big Code For Ai-assisted Programming: A Review Man Fai Wong, Shangxin Guo, Ching Nam Hang, Siu Wai Ho, Chee Wei Tan
- A Bibliometric Review Of Large Language Models Research From 2017 To 2023 Lizhou Fan et al.
- Give Us The Facts: Enhancing Large Language Models With Knowledge Graphs For Fact-aware Language Modeling Linyao Yang, Hongyang Chen, Zhao Li, Xiao Ding, Xindong Wu
- Parameter-efficient Fine-tuning Methods For Pretrained Language Models: A Critical Review And Assessment Lingling Xu, Haoran Xie, Si-zhao Joe Qin, Xiaohui Tao, Fu Lee Wang
- Automatically Correcting Large Language Models: Surveying The Landscape Of Diverse Self-correction Strategies Liangming Pan et al.
- Superclue: A Comprehensive Chinese Large Language Model Benchmark Liang Xu et al.
- 14 Examples Of How Llms Can Transform Materials Science And Chemistry: A Reflection On A Large Language Model Hackathon Kevin Maik Jablonka et al.
- Automating Customer Service Using Langchain: Building Custom Open-source GPT Chatbot For Organizations Keivalya Pandya, Mehfuza Holia
- Automatic Prompt Augmentation And Selection With Chain-of-thought From Labeled Data Kashun Shum, Shizhe Diao, Tong Zhang
- Towards Expert-level Medical Question Answering With Large Language Models Karan Singhal et al.
- Not What You've Signed Up For: Compromising Real-world Llm-integrated Applications With Indirect Prompt Injection Kai Greshake et al.
- The Rise And Potential Of Large Language Model Based Agents: A Survey Zhiheng Xi et al.
- Ai-augmented Surveys: Leveraging Large Language Models And Surveys For Opinion Prediction Junsol Kim, Byungkyu Lee
- Evaluating GPT-4 And Chatgpt On Japanese Medical Licensing Examinations Jungo Kasai, Yuhei Kasai, Keisuke Sakaguchi, Yutaro Yamada, Dragomir Radev
- Backdooring Instruction-tuned Large Language Models With Virtual Prompt Injection Jun Yan et al.
- Minigpt-v2: Large Language Model As A Unified Interface For Vision-language Multi-task Learning Jun Chen et al.
- The Political Ideology Of Conversational AI: Converging Evidence On Chatgpt's Pro-environmental, Left-libertarian Orientation Jochen Hartmann, Jasper Schwenzow, Maximilian Witte
- Qwen Technical Report Jinze Bai et al.
- Bias And Fairness In Chatbots: An Overview Jintang Xue et al.
- The Potential And Pitfalls Of Using A Large Language Model Such As Chatgpt Or GPT-4 As A Clinical Assistant Jingqing Zhang et al.
- On The Robustness Of Chatgpt: An Adversarial And Out-of-distribution Perspective Jindong Wang et al.
- Large Language Models Cannot Self-correct Reasoning Yet Jie Huang et al.
- A Systematic Survey Of Prompt Engineering On Vision-language Foundation Models Jindong Gu et al.
- Geotechnical Parrot Tales (GPT): Harnessing Large Language Models In Geotechnical Engineering Krishna Kumar
- Empowering Molecule Discovery For Molecule-caption Translation With Large Language Models: A Chatgpt Perspective Jiatong Li et al.
- Ethical Chatgpt: Concerns, Challenges, And Commandments Jianlong Zhou, Heimo Müller, Andreas Holzinger, Fang Chen
- The Impact Of Chatgpt And Llms On Medical Imaging Stakeholders: Perspectives And Use Cases Jiancheng Yang, Hongwei Bran Li, Donglai Wei
- Large Language Models In Medicine: The Potentials And Pitfalls Jesutofunmi A. Omiye, Haiwen Gui, Shawheen J. Rezaei, James Zou, Roxana Daneshjou
- AWQ: Activation-aware Weight Quantization For LLM Compression And Acceleration Ji Lin et al.
- Leveraging Large Language Models For Sequential Recommendation Jesse Harte et al.
- Challenges And Applications Of Large Language Models Jean Kaddour et al.
- Auditing Large Language Models: A Three-layered Approach Jakob Mökander, Jonas Schuett, Hannah Rose Kirk, Luciano Floridi
- Chatgpt In The Classroom: An Analysis Of Its Strengths And Weaknesses For Solving Undergraduate Computer Science Questions Ishika Joshi et al.
- Factuality Challenges In The Era Of Large Language Models Isabelle Augenstein et al.
- "it's Not Like Jarvis, But It's Pretty Close!" -- Examining Chatgpt's Usage Among Undergraduate Students In Computer Science Ishika Joshi, Ritvik Budhiraja, Harshal D Akolekar, Jagat Sesh Challa, Dhruv Kumar
- Muse: Text-to-image Generation Via Masked Generative Transformers Huiwen Chang et al.
- Llama 2: Open Foundation And Fine-tuned Chat Models Hugo Touvron et al.
- Llmlingua: Compressing Prompts For Accelerated Inference Of Large Language Models Huiqiang Jiang, Qianhui Wu, Chin-yew Lin, Yuqing Yang, Lili Qiu
- Fingpt: Open-source Financial Large Language Models Hongyang Yang, Xiao-yang Liu, Christina Dan Wang
- Bioinstruct: Instruction Tuning Of Large Language Models For Biomedical Natural Language Processing Hieu Tran, Zhichao Yang, Zonghai Yao, Hong Yu
- VIP5: Towards Multimodal Foundation Models For Recommendation Shijie Geng, Juntao Tan, Shuchang Liu, Zuohui Fu, Yongfeng Zhang
- Instruction Tuning For Large Language Models: A Survey Shengyu Zhang et al.
- Scaling Vision-language Models With Sparse Mixture Of Experts Sheng Shen et al.
- Evaluation Of Chatgpt Family Of Models For Biomedical Reasoning And Classification Shan Chen et al.
- Decoding Chatgpt: A Taxonomy Of Existing Research, Current Challenges, And Possible Future Directions Shahab Saquib Sohail et al.
- H2O: Heavy-hitter Oracle For Efficient Generative Inference Of Large Language Models Zhenyu Zhang et al.
- A Comparative Study Of Open-source Large Language Models, GPT-4 And Claude 2: Multiple-choice Test Taking In Nephrology Sean Wu et al.
- Chatgpt Or Human? Detect And Explain. Explaining Decisions Of Machine Learning Model For Detecting Short Chatgpt-generated Text Sandra Mitrović, Davide Andreoletti, Omran Ayoub
- Let's Have A Chat! A Conversation With Chatgpt: Technology, Applications, And Limitations Sakib Shahriar, Kadhim Hayawi
- Verify-and-edit: A Knowledge-enhanced Chain-of-thought Framework Ruochen Zhao, Xingxuan Li, Shafiq Joty, Chengwei Qin, Lidong Bing
- Retrieving Multimodal Information For Augmented Generation: A Survey Ruochen Zhao et al.
- Scaling Laws For Language Encoding Models In Fmri Richard Antonello, Aditya Vaidya, Alexander G. Huth
- A Universal Question-answering Platform For Knowledge Graphs Reham Omar, Ishika Dhall, Panos Kalnis, Essam Mansour
- Autogen: Enabling Next-gen LLM Applications Via Multi-agent Conversation Qingyun Wu et al.
- Pushing Large Language Models To The 6G Edge: Vision, Challenges, And Opportunities Zheng Lin et al.
- AI Transparency In The Age Of Llms: A Human-centered Research Roadmap Q. Vera Liao, Jennifer Wortman Vaughan
- Students' Perceptions And Preferences Of Generative Artificial Intelligence Feedback For Programming Zhengdong Zhang et al.
- Regulating Chatgpt And Other Large Generative AI Models Philipp Hacker, Andreas Engel, Marco Mauer
- Git-mol: A Multi-modal Large Language Model For Molecular Science With Graph, Image, And Text Pengfei Liu, Yiming Ren, Jun Tao, Zhixiang Ren
- Audiopalm: A Large Language Model That Can Speak And Listen Paul K. Rubenstein et al.
- Going Beyond Nouns With Vision & Language Models Using Synthetic Data Paola Cascante-bonilla et al.
- Hallucinations In Large Multilingual Translation Models Nuno M. Guerreiro et al.
- CAT-LM: Training Language Models On Aligned Code And Tests Nikitha Rao, Kush Jain, Uri Alon, Claire Le Goues, Vincent J. Hellendoorn
- A Stitch In Time Saves Nine: Detecting And Mitigating Hallucinations Of Llms By Validating Low-confidence Generation Neeraj Varshney, Wenlin Yao, Hongming Zhang, Jianshu Chen, Dong Yu
- Exploring The Potential Of Large Language Models To Generate Formative Programming Feedback Natalie Kiesler, Dominic Lohr, Hieke Keuning
- State Of What Art? A Call For Multi-prompt LLM Evaluation Moran Mizrahi et al.
- A Review Of Chatgpt Applications In Education, Marketing, Software Engineering, And Healthcare: Benefits, Drawbacks, And Research Directions Mohammad Fraiwan, Natheer Khasawneh
- Time-llm: Time Series Forecasting By Reprogramming Large Language Models Ming Jin et al.
- Med-flamingo: A Multimodal Medical Few-shot Learner Michael Moor et al.
- A Large Language Model Approach To Educational Survey Feedback Analysis Michael J. Parker, Caitlin Anderson, Claire Stone, Yearim Oh
- Spqr: A Sparse-quantized Representation For Near-lossless LLM Weight Compression Tim Dettmers et al.
- Medalpaca -- An Open-source Collection Of Medical Conversational AI Models And Training Data Tianyu Han et al.
- RLHF-V: Towards Trustworthy Mllms Via Behavior Alignment From Fine-grained Correctional Human Feedback Tianyu Yu et al.
- Red Teaming Chatgpt Via Jailbreaking: Bias, Robustness, Reliability And Toxicity Terry Yue Zhuo, Yujin Huang, Chunyang Chen, Zhenchang Xing
- Caption Anything: Interactive Image Description With Diverse Multimodal Controls Teng Wang et al.
- Evallm: Interactive Evaluation Of Large Language Model Prompts On User-defined Criteria Tae Soo Kim, Yoonjoo Lee, Jamin Shin, Young-ho Kim, Juho Kim
- Zhongjing: Enhancing The Chinese Medical Capabilities Of Large Language Model Through Expert Feedback And Real-world Multi-turn Dialogue Songhua Yang et al.
- On The Possibilities Of Ai-generated Text Detection Souradip Chakraborty et al.
- LL3DA: Visual Interactive Instruction Tuning For Omni-3d Understanding, Reasoning, And Planning Sijin Chen et al.
- Opportunities And Challenges For Chatgpt And Large Language Models In Biomedicine And Health Shubo Tian et al.
- Automl-gpt: Automatic Machine Learning With GPT Shujian Zhang, Chengyue Gong, Lemeng Wu, Xingchao Liu, Mingyuan Zhou
- Unlocking The Potential Of Chatgpt: A Comprehensive Exploration Of Its Applications, Advantages, Limitations, And Future Directions In Natural Language Processing Walid Hariri
- Chatgpt Beyond English: Towards A Comprehensive Evaluation Of Large Language Models In Multilingual Learning Viet Dac Lai et al.
- Is GPT-4 A Reliable Rater? Evaluating Consistency In GPT-4 Text Ratings Veronika Hackl, Alexandra Elena Müller, Michael Granitzer, Maximilian Sailer
- Can Ai-generated Text Be Reliably Detected? Vinu Sankar Sadasivan, Aounon Kumar, Sriram Balasubramanian, Wenxiao Wang, Soheil Feizi
- Flashattention-2: Faster Attention With Better Parallelism And Work Partitioning Tri Dao
- Nemo Guardrails: A Toolkit For Controllable And Safe LLM Applications With Programmable Rails Traian Rebedea, Razvan Dinu, Makesh Sreedhar, Christopher Parisien, Jonathan Cohen
- Promptcblue: A Chinese Prompt Tuning Benchmark For The Medical Domain Wei Zhu, Xiaoling Wang, Huanran Zheng, Mosha Chen, Buzhou Tang
- Is Chatgpt Good At Search? Investigating Large Language Models As Re-ranking Agents Weiwei Sun et al.
- BLIVA: A Simple Multimodal LLM For Better Handling Of Text-rich Visual Questions Wenbo Hu et al.
- Chatgraph: Interpretable Text Classification By Converting Chatgpt Knowledge To Graphs Yucheng Shi et al.
- Capabilities Of GPT-4 On Medical Challenge Problems Harsha Nori, Nicholas King, Scott Mayer Mckinney, Dean Carignan, Eric Horvitz
- Is Chatgpt The Ultimate Programming Assistant -- How Far Is It? Haoye Tian et al.
- Llm-rec: Personalized Recommendation Via Prompting Large Language Models Hanjia Lyu et al.
- Personallm: Investigating The Ability Of Large Language Models To Express Personality Traits Hang Jiang et al.
- Llama Guard: Llm-based Input-output Safeguard For Human-ai Conversations Hakan Inan et al.
- GPT-4 Enhanced Multimodal Grounding For Autonomous Driving: Leveraging Cross-modal Attention With Large Language Models Haicheng Liao et al.
- Chatgpt For Shaping The Future Of Dentistry: The Potential Of Multi-modal Large Language Model Hanyao Huang et al.
- Efficient Streaming Language Models With Attention Sinks Guangxuan Xiao, Yuandong Tian, Beidi Chen, Song Han, Mike Lewis
- Explainability For Large Language Models: A Survey Haiyan Zhao et al.
- Gemini: A Family Of Highly Capable Multimodal Models Gemini Team et al.
- Multimodal Chatgpt For Medical Applications: An Experimental Study Of GPT-4V Zhiling Yan et al.
- Chatgpt Outperforms Crowd-workers For Text-annotation Tasks Fabrizio Gilardi, Meysam Alizadeh, Maël Kubli
- Should Chatgpt Be Biased? Challenges And Risks Of Bias In Large Language Models Emilio Ferrara
- Kosmos-2: Grounding Multimodal Large Language Models To The World Zhiliang Peng et al.
- Simulating H.P. Lovecraft Horror Literature With The Chatgpt Large Language Model Eduardo C. Garrido-merchán, José Luis Arroyo-barrigüete, Roberto Gozalo-brizuela
- Text-to-sql Empowered By Large Language Models: A Benchmark Evaluation Dawei Gao et al.
- Almanac: Retrieval-augmented Language Models For Clinical Medicine Cyril Zakka et al.
- Debiasing Vision-language Models Via Biased Prompts Ching-yao Chuang, Varun Jampani, Yuanzhen Li, Antonio Torralba, Stefanie Jegelka
- Distilling Step-by-step! Outperforming Larger Language Models With Less Training Data And Smaller Model Sizes Cheng-yu Hsieh et al.
- One Small Step For Generative AI, One Giant Leap For AGI: A Complete Survey On Chatgpt In AIGC Era Chaoning Zhang et al.
- Blackvip: Black-box Visual Prompting For Robust Transfer Learning Changdae Oh et al.
- K2: A Foundation Language Model For Geoscience Knowledge Understanding And Utilization Cheng Deng et al.
- Llmseceval: A Dataset Of Natural Language Prompts For Security Evaluations Catherine Tony, Markus Mutas, Nicolás E. Díaz Ferreyra, Riccardo Scandariato
- Pmc-llama: Towards Building Open-source Language Models For Medicine Chaoyi Wu et al.
- Large Language Models On Graphs: A Comprehensive Survey Bowen Jin et al.
- A Short Survey Of Viewing Large Language Models In Legal Aspect Zhongxiang Sun
- Evaluation Of Chatgpt For Nlp-based Mental Health Applications Bishal Lamichhane
- Friend Or Foe? Exploring The Implications Of Large Language Models On The Science System Benedikt Fecher, Marcel Hebing, Melissa Laufer, Jörg Pohle, Fabian Sofsky
- Code Llama: Open Foundation Models For Code Baptiste Rozière et al.
- Check Your Facts And Try Again: Improving Large Language Models With External Knowledge And Automated Feedback Baolin Peng et al.
- Clinical Camel: An Open Expert-level Medical Language Model With Dialogue-based Knowledge Encoding Augustin Toma et al.
- Scaling Transformer To 1M Tokens And Beyond With RMT Aydar Bulatov, Yuri Kuratov, Yermek Kapushev, Mikhail S. Burtsev
- Chatgpt: Applications, Opportunities, And Threats Aram Bahrini et al.
- Med-halt: Medical Domain Hallucination Test For Large Language Models Ankit Pal, Logesh Kumar Umapathi, Malaikannan Sankarasubbu
- Fundamentals Of Generative Large Language Models And Perspectives In Cyber-defense Andrei Kucharavy et al.
- Chemcrow: Augmenting Large-language Models With Chemistry Tools Andres M Bran et al.
- Generative AI: Implications And Applications For Education Anastasia Olnancy Olga et al.
- Chatgpt Is A Remarkable Tool -- For Experts Amos Azaria, Rina Azoulay, Shulamit Reches
- Fighting Fire With Fire: Can Chatgpt Detect Ai-generated Text? Amrita Bhattacharjee, Huan Liu
- Large Language Models For Telecom: Forthcoming Impact On The Industry Ali Maatouk, Nicola Piovesan, Fadhel Ayed, Antonio De Domenico, Merouane Debbah
- The (ab)use Of Open Source Code To Train Large Language Models Ali Al-kaswan, Maliheh Izadi
- Model Tuning Or Prompt Tuning? A Study Of Large Language Models For Clinical Concept And Relation Extraction Cheng Peng et al.
- What Does CLIP Know About A Red Circle? Visual Prompt Engineering For Vlms Aleksandar Shtedritski, Christian Rupprecht, Andrea Vedaldi
- Mamba: Linear-time Sequence Modeling With Selective State Spaces Albert Gu, Tri Dao
- Can Chatgpt And Bard Generate Aligned Assessment Items? A Reliability Analysis Against Human Performance Abdolvahab Khademi
- Recommender Systems In The Era Of Large Language Models (llms) Zihuai Zhao et al.
- A Survey On Large Language Model (LLM) Security And Privacy: The Good, The Bad, And The Ugly Yifan Yao et al.
- Summary Of Chatgpt-related Research And Perspective Towards The Future Of Large Language Models Yiheng Liu et al.
- INSTRUCTEVAL: Towards Holistic Evaluation Of Instruction-tuned Large Language Models Yew Ken Chia, Pengfei Hong, Lidong Bing, Soujanya Poria
- NL2TL: Transforming Natural Languages To Temporal Logics Using Large Language Models Yongchao Chen, Rujul Gandhi, Yang Zhang, Chuchu Fan
- Trustworthy Llms: A Survey And Guideline For Evaluating Large Language Models' Alignment Yang Liu et al.
- Improving Large Language Models For Clinical Named Entity Recognition Via Prompt Engineering Yan Hu et al.
- Mental-llm: Leveraging Large Language Models For Mental Health Prediction Via Online Text Data Xuhai Xu et al.
- Emotional Intelligence Of Large Language Models Xuena Wang, Xueting Li, Zi Yin, Yue Wu, Liu Jia
- Vision Language Models In Autonomous Driving: A Survey And Outlook Xingcheng Zhou et al.
- Unveiling Security, Privacy, And Ethical Concerns Of Chatgpt Xiaodong Wu, Ran Duan, Jianbing Ni
- HPC-GPT: Integrating Large Language Model For High-performance Computing Xianzhong Ding et al.
- Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! Xiangyu Qi et al.
- Universalner: Targeted Distillation From Large Language Models For Open Named Entity Recognition Wenxuan Zhou, Sheng Zhang, Yu Gu, Muhao Chen, Hoifung Poon
- Large Language Models In Education: Vision And Opportunities Wensheng Gan, Zhenlian Qi, Jiayang Wu, Jerry Chun-wei Lin
- Generative Recommendation: Towards Next-generation Recommender Paradigm Wenjie Wang, Xinyu Lin, Fuli Feng, Xiangnan He, Tat-seng Chua
- SPHINX: The Joint Mixing Of Weights, Tasks, And Visual Embeddings For Multi-modal Large Language Models Ziyi Lin et al.
- Longbench: A Bilingual, Multitask Benchmark For Long Context Understanding Yushi Bai et al.
- Exploring The Impact Of Instruction Data Scaling On Large Language Models: An Empirical Study On Real-world Use Cases Yunjie Ji et al.
- Educhat: A Large-scale Language Model-based Chatbot System For Intelligent Education Yuhao Dan et al.
- Large Language Models In Healthcare And Medical Domain: A Review Zabir Al Nazi, Wei Peng
- Hard Prompts Made Easy: Gradient-based Discrete Optimization For Prompt Tuning And Discovery Yuxin Wen et al.
- Chatbot Arena: An Open Platform For Evaluating Llms By Human Preference Wei-lin Chiang et al.
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- Chatglm: A Family Of Large Language Models From GLM-130B To GLM-4 All Tools Team Glm et al.
- A Comprehensive Survey Of Hallucination Mitigation Techniques In Large Language Models S. M Towhidul Islam Tonmoy et al.
- Large Language Models And Games: A Survey And Roadmap Roberto Gallotta et al.
- Me Llama: Foundation Large Language Models For Medical Applications Qianqian Xie et al.
- A Systematic Survey Of Prompt Engineering In Large Language Models: Techniques And Applications Pranab Sahoo et al.
- From Text To Transformation: A Comprehensive Review Of Large Language Models' Versatility Pravneet Kaur et al.
- AI Hallucinations: A Misnomer Worth Clarifying Negar Maleki, Balaji Padmanabhan, Kaushik Dutta
- Exploring Chatgpt And Its Impact On Society Md. Asraful Haque, Shuai Li
- Capabilities Of Gemini Models In Medicine Khaled Saab et al.
- The Dawn After The Dark: An Empirical Study On Factuality Hallucination In Large Language Models Junyi Li et al.
- Openmedlm: Prompt Engineering Can Out-perform Fine-tuning In Medical Question-answering With Open-source Large Language Models Jenish Maharjan et al.
- Feedback-generation For Programming Exercises With GPT-4 Imen Azaiz, Natalie Kiesler, Sven Strickroth
- Revolutionizing Finance With Llms: An Overview Of Applications And Insights Huaqin Zhao et al.
- Fine Tuning Vs. Retrieval Augmented Generation For Less Popular Knowledge Heydar Soudani, Evangelos Kanoulas, Faegheh Hasibi
- Building Better AI Agents: A Provocation On The Utilisation Of Persona In Llm-based Conversational Agents Guangzhi Sun, Xiao Zhan, Jose Such
- Gemini 1.5: Unlocking Multimodal Understanding Across Millions Of Tokens Of Context Gemini Team et al.
- Large Language Models In Cybersecurity: State-of-the-art Farzad Nourmohammadzadeh Motlagh et al.
- Chemllm: A Chemical Large Language Model Di Zhang et al.
- Recent Advances In Generative AI And Large Language Models: Current Status, Challenges, And Perspectives Desta Haileselassie Hagos, Rick Battle, Danda B. Rawat
- The Revolution Of Multimodal Large Language Models: A Survey Davide Caffagni et al.
- Rethinking Interpretability In The Era Of Large Language Models Chandan Singh, Jeevana Priya Inala, Michel Galley, Rich Caruana, Jianfeng Gao
- Taking The Next Step With Generative Artificial Intelligence: The Transformative Role Of Multimodal Large Language Models In Science Education Arne Bewersdorff et al.
- RAG Vs Fine-tuning: Pipelines, Tradeoffs, And A Case Study On Agriculture Angels Balaguer et al.
- AI And Memory Wall Amir Gholami et al.
- Optimization Methods For Personalizing Large Language Models Through Retrieval Augmentation Alireza Salemi, Surya Kallumadi, Hamed Zamani
- Large Language Models For Data Annotation And Synthesis: A Survey Zhen Tan et al.
- Survey On Large Language Model-enhanced Reinforcement Learning: Concept, Taxonomy, And Methods Yuji Cao et al.
- Sora: A Review On Background, Technology, Limitations, And Opportunities Of Large Vision Models Yixin Liu et al.
- Large Language Models In Mental Health Care: A Scoping Review Yining Hua et al.
- Biomistral: A Collection Of Open-source Pretrained Large Language Models For Medical Domains Yanis Labrak et al.
- Mgte: Generalized Long-context Text Representation And Reranking Models For Multilingual Text Retrieval Xin Zhang et al.
- A Survey On RAG Meeting Llms: Towards Retrieval-augmented Large Language Models Wenqi Fan et al.
- CRUD-RAG: A Comprehensive Chinese Benchmark For Retrieval-augmented Generation Of Large Language Models Yuanjie Lyu et al.
- Farsight: Fostering Responsible AI Awareness During AI Application Prototyping Zijie J. Wang, Chinmay Kulkarni, Lauren Wilcox, Michael Terry, Michael Madaio
🏷 Arxiv
- Longformer: The Long-document Transformer Iz Beltagy, Matthew E. Peters, Arman Cohan
- Hierarchical Learning For Generation With Long Source Sequences Tobias Rohde, Xiaoxia Wu, Yinhan Liu
- Long-span Summarization Via Local Attention And Content Selection Potsawee Manakul, Mark J. F. Gales
- DYLE: Dynamic Latent Extraction For Abstractive Long-input Summarization Ziming Mao et al.
- Mind The Gap: Assessing Temporal Generalization In Neural Language Models Angeliki Lazaridou et al.
- Codegen: An Open Large Language Model For Code With Multi-turn Program Synthesis Erik Nijkamp et al.
- Block-recurrent Transformers Delesley Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur
- Memorizing Transformers Yuhuai Wu, Markus N. Rabe, Delesley Hutchins, Christian Szegedy
- Leancontext: Cost-efficient Domain-specific Question Answering Using Llms Md Adnan Arefeen, Biplob Debnath, Srimat Chakradhar
- Llmlingua: Compressing Prompts For Accelerated Inference Of Large Language Models Huiqiang Jiang, Qianhui Wu, Chin-yew Lin, Yuqing Yang, Lili Qiu
- EVA-02: A Visual Representation For Neon Genesis Yuxin Fang et al.
- CORE-GPT: Combining Open Access Research And Large Language Models For Credible, Trustworthy Question Answering David Pride, Matteo Cancellieri, Petr Knoth
- Summary Of Chatgpt-related Research And Perspective Towards The Future Of Large Language Models Yiheng Liu et al.
- Large Language Models In Mental Health Care: A Scoping Review Yining Hua et al.
🏷 Attention Mechanism
- Sequence-to-sequence Learning As Beam-search Optimization Sam Wiseman, Alexander M. Rush
- Topic Aware Neural Response Generation Chen Xing et al.
- Attention Strategies For Multi-source Sequence-to-sequence Learning Jindřich Libovický, Jindřich Helcl
- Attention Is All You Need Ashish Vaswani et al.
- Weighted Transformer Network For Machine Translation Karim Ahmed, Nitish Shirish Keskar, Richard Socher
- Phase Conductor On Multi-layered Attentions For Machine Comprehension Rui Liu, Wei Wei, Weiguang Mao, Maria Chikina
- Gated-attention Architectures For Task-oriented Language Grounding Devendra Singh Chaplot, Kanthashree Mysore Sathyendra, Rama Kumar Pasumarthi, Dheeraj Rajagopal, Ruslan Salakhutdinov
- Frustratingly Short Attention Spans In Neural Language Modeling Michał Daniluk, Tim Rocktäschel, Johannes Welbl, Sebastian Riedel
- A Unified Query-based Generative Model For Question Generation And Question Answering Linfeng Song, Zhiguo Wang, Wael Hamza
- Multilingual Constituency Parsing With Self-attention And Pre-training Nikita Kitaev, Steven Cao, Dan Klein
- Character-level Language Modeling With Deeper Self-attention Rami Al-rfou, Dokook Choe, Noah Constant, Mandy Guo, Llion Jones
- Commonsense For Generative Multi-hop Question Answering Tasks Lisa Bauer, Yicheng Wang, Mohit Bansal
- Sdnet: Contextualized Attention-based Deep Network For Conversational Question Answering Chenguang Zhu, Michael Zeng, Xuedong Huang
- An Affect-rich Neural Conversational Model With Biased Attention And Weighted Cross-entropy Loss Peixiang Zhong, Di Wang, Chunyan Miao
- Attention-guided Answer Distillation For Machine Reading Comprehension Minghao Hu et al.
- Topic-based Evaluation For Conversational Bots Fenfei Guo et al.
- Multi-cast Attention Networks For Retrieval-based Question Answering And Response Prediction Yi Tay, Luu Anh Tuan, Siu Cheung Hui
- Polite Dialogue Generation Without Parallel Data Tong Niu, Mohit Bansal
- Unified Vision-language Pre-training For Image Captioning And VQA Luowei Zhou et al.
- Sample Efficient Text Summarization Using A Single Pre-trained Transformer Urvashi Khandelwal, Kevin Clark, Dan Jurafsky, Lukasz Kaiser
- Recosa: Detecting The Relevant Contexts With Self-attention For Multi-turn Dialogue Generation Hainan Zhang, Yanyan Lan, Liang Pang, Jiafeng Guo, Xueqi Cheng
- Multimodal Attention Networks For Low-level Vision-and-language Navigation Federico Landi, Lorenzo Baraldi, Marcella Cornia, Massimiliano Corsini, Rita Cucchiara
- Scalable Attentive Sentence-pair Modeling Via Distilled Sentence Embedding Oren Barkan et al.
- Structbert: Incorporating Language Structures Into Pre-training For Deep Language Understanding Wei Wang et al.
- Entity-consistent End-to-end Task-oriented Dialogue System With KB Retriever Libo Qin et al.
- Attention Is Not Explanation Sarthak Jain, Byron C. Wallace
- Revealing The Dark Secrets Of BERT Olga Kovaleva, Alexey Romanov, Anna Rogers, Anna Rumshisky
- Understanding The Behaviors Of BERT In Ranking Yifan Qiao, Chenyan Xiong, Zhenghao Liu, Zhiyuan Liu
- Data-to-text Generation With Entity Modeling Ratish Puduppully, Li Dong, Mirella Lapata
- Generating Persona Consistent Dialogues By Exploiting Natural Language Inference Haoyu Song, Wei-nan Zhang, Jingwen Hu, Ting Liu
- Contextualized Sparse Representations For Real-time Open-domain Question Answering Jinhyuk Lee, Minjoon Seo, Hannaneh Hajishirzi, Jaewoo Kang
- Improving Transformer Models By Reordering Their Sublayers Ofir Press, Noah A. Smith, Omer Levy
- BERT For Joint Intent Classification And Slot Filling Qian Chen, Zhu Zhuo, Wen Wang
- PLATO: Pre-trained Dialogue Generation Model With Discrete Latent Variable Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang
- Transformers Without Tears: Improving The Normalization Of Self-attention Toan Q. Nguyen, Julian Salazar
- Dialogue Transformers Vladimir Vlasov, Johannes E. M. Mosig, Alan Nichol
- LXMERT: Learning Cross-modality Encoder Representations From Transformers Hao Tan, Mohit Bansal
- Unified Language Model Pre-training For Natural Language Understanding And Generation Li Dong et al.
- A Tensorized Transformer For Language Modeling Xindian Ma et al.
- Pay Less Attention With Lightweight And Dynamic Convolutions Felix Wu, Angela Fan, Alexei Baevski, Yann N. Dauphin, Michael Auli
- Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection Guangxiang Zhao et al.
- A Pre-training Based Personalized Dialogue Generation Model With Persona-sparse Data Yinhe Zheng, Rongsheng Zhang, Xiaoxi Mao, Minlie Huang
- Cloze-driven Pretraining Of Self-attention Networks Alexei Baevski, Sergey Edunov, Yinhan Liu, Luke Zettlemoyer, Michael Auli
- Are Sixteen Heads Really Better Than One? Paul Michel, Omer Levy, Graham Neubig
- MUSE: Parallel Multi-scale Attention For Sequence To Sequence Learning Guangxiang Zhao, Xu Sun, Jingjing Xu, Zhiyuan Zhang, Liangchen Luo
- Structured Pruning Of A Bert-based Question Answering Model J. S. Mccarley, Rishav Chakravarti, Avirup Sil
- Single Headed Attention RNN: Stop Thinking With Your Head Stephen Merity
- Language Modeling With Deep Transformers Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney
- Frustratingly Easy Natural Question Answering Lin Pan et al.
- Modeling Graph Structure In Transformer For Better Amr-to-text Generation Jie Zhu et al.
- Modeling Recurrence For Transformer Jie Hao et al.
- A Generalized Framework Of Sequence Generation With Application To Undirected Sequence Models Elman Mansimov, Alex Wang, Sean Welleck, Kyunghyun Cho
- How Does BERT Answer Questions? A Layer-wise Analysis Of Transformer Representations Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- Exbert: A Visual Analysis Tool To Explore Learned Representations In Transformers Models Benjamin Hoover, Hendrik Strobelt, Sebastian Gehrmann
- Adding Interpretable Attention To Neural Translation Models Improves Word Alignment Thomas Zenkel, Joern Wuebker, John Denero
- Analyzing Multi-head Self-attention: Specialized Heads Do The Heavy Lifting, The Rest Can Be Pruned Elena Voita, David Talbot, Fedor Moiseev, Rico Sennrich, Ivan Titov
- Learning To Deceive With Attention-based Explanations Danish Pruthi, Mansi Gupta, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton
- MASS: Masked Sequence To Sequence Pre-training For Language Generation Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-yan Liu
- Interpreting And Improving Natural-language Processing (in Machines) With Natural Language-processing (in The Brain) Mariya Toneva, Leila Wehbe
- Stabilizing Transformers For Reinforcement Learning Emilio Parisotto et al.
- A Multiscale Visualization Of Attention In The Transformer Model Jesse Vig
- Bp-transformer: Modelling Long-range Context Via Binary Partitioning Zihao Ye, Qipeng Guo, Quan Gan, Xipeng Qiu, Zheng Zhang
- Do Attention Heads In BERT Track Syntactic Dependencies? Phu Mon Htut, Jason Phang, Shikha Bordia, Samuel R. Bowman
- Tree Transformer: Integrating Tree Structures Into Self-attention Yau-shian Wang, Hung-yi Lee, Yun-nung Chen
- Attentive History Selection For Conversational Question Answering Chen Qu et al.
- Augmenting Self-attention With Persistent Memory Sainbayar Sukhbaatar, Edouard Grave, Guillaume Lample, Herve Jegou, Armand Joulin
- Adaptive Attention Span In Transformers Sainbayar Sukhbaatar, Edouard Grave, Piotr Bojanowski, Armand Joulin
- Blockwise Self-attention For Long Document Understanding Jiezhong Qiu et al.
- Fast Transformer Decoding: One Write-head Is All You Need Noam Shazeer
- Semantically Conditioned Dialog Response Generation Via Hierarchical Disentangled Self-attention Wenhu Chen, Jianshu Chen, Pengda Qin, Xifeng Yan, William Yang Wang
- Bridging The Gap For Tokenizer-free Language Models Dokook Choe, Rami Al-rfou, Mandy Guo, Heeyoung Lee, Noah Constant
- ACUTE-EVAL: Improved Dialogue Evaluation With Optimized Questions And Multi-turn Comparisons Margaret Li, Jason Weston, Stephen Roller
- Synchronous Bidirectional Inference For Neural Sequence Generation Jiajun Zhang, Long Zhou, Yang Zhao, Chengqing Zong
- Megatron-lm: Training Multi-billion Parameter Language Models Using Model Parallelism Mohammad Shoeybi et al.
- Improving Knowledge-aware Dialogue Generation Via Knowledge Base Question Answering Jian Wang et al.
- Text Infilling Wanrong Zhu, Zhiting Hu, Eric Xing
- Syntax-infused Transformer And BERT Models For Machine Translation And Natural Language Understanding Dhanasekar Sundararaman et al.
- Towards Transfer Learning For End-to-end Speech Synthesis From Deep Pre-trained Language Models Wei Fang, Yu-an Chung, James Glass
- A Modular Task-oriented Dialogue System Using A Neural Mixture-of-experts Jiahuan Pei, Pengjie Ren, Maarten De Rijke
- Learning To Answer By Learning To Ask: Getting The Best Of GPT-2 And BERT Worlds Tassilo Klein, Moin Nabi
- Attention-informed Mixed-language Training For Zero-shot Cross-lingual Task-oriented Dialogue Systems Zihan Liu, Genta Indra Winata, Zhaojiang Lin, Peng Xu, Pascale Fung
- Encoder-agnostic Adaptation For Conditional Language Generation Zachary M. Ziegler, Luke Melas-kyriazi, Sebastian Gehrmann, Alexander M. Rush
- Visualizing Attention In Transformer-based Language Representation Models Jesse Vig
- Sg-net: Syntax-guided Machine Reading Comprehension Zhuosheng Zhang et al.
- Low-rank Bottleneck In Multi-head Attention Models Srinadh Bhojanapalli, Chulhee Yun, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar
- SEAL: Segment-wise Extractive-abstractive Long-form Text Summarization Yao Zhao, Mohammad Saleh, Peter J. Liu
- Linformer: Self-attention With Linear Complexity Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, Hao Ma
- Artificial Intelligence Versus Maya Angelou: Experimental Evidence That People Cannot Differentiate Ai-generated From Human-written Poetry Nils Köbis, Luca Mossink
- When BERT Plays The Lottery, All Tickets Are Winning Sai Prasanna, Anna Rogers, Anna Rumshisky
- KVL-BERT: Knowledge Enhanced Visual-and-linguistic BERT For Visual Commonsense Reasoning Dandan Song, Siyi Ma, Zhanchen Sun, Sicheng Yang, Lejian Liao
- Layoutlmv2: Multi-modal Pre-training For Visually-rich Document Understanding Yang Xu et al.
- Lite Transformer With Long-short Range Attention Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han
- Enabling Language Models To Fill In The Blanks Chris Donahue, Mina Lee, Percy Liang
- VD-BERT: A Unified Vision And Dialog Transformer With BERT Yue Wang et al.
- Compressing Large-scale Transformer-based Models: A Case Study On BERT Prakhar Ganesh et al.
- KG-BART: Knowledge Graph-augmented BART For Generative Commonsense Reasoning Ye Liu, Yao Wan, Lifang He, Hao Peng, Philip S. Yu
- IART: Intent-aware Response Ranking With Transformers In Information-seeking Conversation Systems Liu Yang et al.
- ERNIE-GEN: An Enhanced Multi-flow Pre-training And Fine-tuning Framework For Natural Language Generation Dongling Xiao et al.
- Deformer: Decomposing Pre-trained Transformers For Faster Question Answering Qingqing Cao, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian
- Visbert: Hidden-state Visualizations For Transformers Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- Long Range Arena: A Benchmark For Efficient Transformers Yi Tay et al.
- PONE: A Novel Automatic Evaluation Metric For Open-domain Generative Dialogue Systems Tian Lan, Xian-ling Mao, Wei Wei, Xiaoyan Gao, Heyan Huang
- DUMA: Reading Comprehension With Transposition Thinking Pengfei Zhu, Hai Zhao, Xiaoguang Li
- Synthesizer: Rethinking Self-attention In Transformer Models Yi Tay et al.
- Chatbot Interaction With Artificial Intelligence: Human Data Augmentation With T5 And Language Transformer Ensemble For Text Classification Jordan J. Bird, Anikó Ekárt, Diego R. Faria
- GMAT: Global Memory Augmentation For Transformers Ankit Gupta, Jonathan Berant
- Addressing Some Limitations Of Transformers With Feedback Memory Angela Fan, Thibaut Lavril, Edouard Grave, Armand Joulin, Sainbayar Sukhbaatar
- A Recurrent Vision-and-language BERT For Navigation Yicong Hong, Qi Wu, Yuankai Qi, Cristian Rodriguez-opazo, Stephen Gould
- Talking-heads Attention Noam Shazeer, Zhenzhong Lan, Youlong Cheng, Nan Ding, Le Hou
- ECONET: Effective Continual Pretraining Of Language Models For Event Temporal Reasoning Rujun Han, Xiang Ren, Nanyun Peng
- Natural Language Rationales With Full-stack Visual Reasoning: From Pixels To Semantic Frames To Commonsense Graphs Ana Marasović et al.
- CPM: A Large-scale Generative Chinese Pre-trained Language Model Zhengyan Zhang et al.
- Ernie-doc: A Retrospective Long-document Modeling Transformer Siyu Ding et al.
- Are We Pretraining It Right? Digging Deeper Into Visio-linguistic Pretraining Amanpreet Singh, Vedanuj Goswami, Devi Parikh
- Behind The Scene: Revealing The Secrets Of Pre-trained Vision-and-language Models Jize Cao et al.
- Cocon: A Self-supervised Approach For Controlled Text Generation Alvin Chan, Yew-soon Ong, Bill Pung, Aston Zhang, Jie Fu
- Edgebert: Sentence-level Energy Optimizations For Latency-aware Multi-task NLP Inference Thierry Tambe et al.
- Non-autoregressive Machine Translation With Disentangled Context Transformer Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu
- GOBO: Quantizing Attention-based NLP Models For Low Latency And Energy Efficient Inference Ali Hadi Zadeh, Isak Edo, Omar Mohamed Awad, Andreas Moshovos
- Improving Natural Language Processing Tasks With Human Gaze-guided Neural Attention Ekta Sood, Simon Tannert, Philipp Mueller, Andreas Bulling
- Prophetnet: Predicting Future N-gram For Sequence-to-sequence Pre-training Weizhen Qi et al.
- The Cascade Transformer: An Application For Efficient Answer Sentence Selection Luca Soldaini, Alessandro Moschitti
- Automated Source Code Generation And Auto-completion Using Deep Learning: Comparing And Discussing Current Language-model-related Approaches Juan Cruz-benito, Sanjay Vishwakarma, Francisco Martin-fernandez, Ismael Faro
- Encoding Syntactic Knowledge In Transformer Encoder For Intent Detection And Slot Filling Jixuan Wang, Kai Wei, Martin Radfar, Weiwei Zhang, Clement Chung
- Look Before You Speak: Visually Contextualized Utterances Paul Hongsuck Seo, Arsha Nagrani, Cordelia Schmid
- ETC: Encoding Long And Structured Inputs In Transformers Joshua Ainslie et al.
- Big Bird: Transformers For Longer Sequences Manzil Zaheer et al.
- Pchatbot: A Large-scale Dataset For Personalized Chatbot Hongjin Qian et al.
- Hard-coded Gaussian Attention For Neural Machine Translation Weiqiu You, Simeng Sun, Mohit Iyyer
- Earlybert: Efficient BERT Training Via Early-bird Lottery Tickets Xiaohan Chen et al.
- Turngpt: A Transformer-based Language Model For Predicting Turn-taking In Spoken Dialog Erik Ekstedt, Gabriel Skantze
- A Transformer-based Approach For Source Code Summarization Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-wei Chang
- A Controllable Model Of Grounded Response Generation Zeqiu Wu et al.
- Rethinking Positional Encoding In Language Pre-training Guolin Ke, Di He, Tie-yan Liu
- Contrastive Triple Extraction With Generative Transformer Hongbin Ye et al.
- Mobilebert: A Compact Task-agnostic BERT For Resource-limited Devices Zhiqing Sun et al.
- Dialogbert: Discourse-aware Response Generation Via Learning To Recover And Rank Utterances Xiaodong Gu, Kang Min Yoo, Jung-woo Ha
- Unilmv2: Pseudo-masked Language Models For Unified Language Model Pre-training Hangbo Bao et al.
- Multilingual Speech Translation With Efficient Finetuning Of Pretrained Models Xian Li et al.
- DSTC8-AVSD: Multimodal Semantic Transformer Network With Retrieval Style Word Generator Hwanhee Lee et al.
- HAT: Hardware-aware Transformers For Efficient Natural Language Processing Hanrui Wang et al.
- Minilm: Deep Self-attention Distillation For Task-agnostic Compression Of Pre-trained Transformers Wenhui Wang et al.
- Longformer: The Long-document Transformer Iz Beltagy, Matthew E. Peters, Arman Cohan
- Minilmv2: Multi-head Self-attention Relation Distillation For Compressing Pretrained Transformers Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei
- Lightner: A Lightweight Tuning Paradigm For Low-resource NER Via Pluggable Prompting Xiang Chen et al.
- Vlmo: Unified Vision-language Pre-training With Mixture-of-modality-experts Hangbo Bao et al.
- The NLP Cookbook: Modern Recipes For Transformer Based Deep Learning Architectures Sushant Singh, Ausif Mahmood
- G-transformer For Document-level Machine Translation Guangsheng Bao, Yue Zhang, Zhiyang Teng, Boxing Chen, Weihua Luo
- Improving Stack Overflow Question Title Generation With Copying Enhanced Codebert Model And Bi-modal Information Fengji Zhang et al.
- Mention Memory: Incorporating Textual Knowledge Into Transformers Through Entity Mention Attention Michiel De Jong, Yury Zemlyanskiy, Nicholas Fitzgerald, Fei Sha, William Cohen
- Generic Attention-model Explainability For Interpreting Bi-modal And Encoder-decoder Transformers Hila Chefer, Shir Gur, Lior Wolf
- Explaining Documents' Relevance To Search Queries Razieh Rahimi, Youngwoo Kim, Hamed Zamani, James Allan
- FILIP: Fine-grained Interactive Language-image Pre-training Lewei Yao et al.
- Less Is More: Pre-train A Strong Text Encoder For Dense Retrieval Using A Weak Decoder Shuqi Lu et al.
- Focused Attention Improves Document-grounded Generation Shrimai Prabhumoye, Kazuma Hashimoto, Yingbo Zhou, Alan W Black, Ruslan Salakhutdinov
- Swinbert: End-to-end Transformers With Sparse Attention For Video Captioning Kevin Lin et al.
- Pretrained Transformers As Universal Computation Engines Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch
- Using Prior Knowledge To Guide Bert's Attention In Semantic Textual Matching Tasks Tingyu Xia, Yue Wang, Yuan Tian, Yi Chang
- Conversational Question Answering Over Knowledge Graphs With Transformer And Graph Attention Networks Endri Kacupaj et al.
- Societal Biases In Language Generation: Progress And Challenges Emily Sheng, Kai-wei Chang, Premkumar Natarajan, Nanyun Peng
- Text Compression-aided Transformer Encoding Zuchao Li et al.
- Causal Attention For Vision-language Tasks Xu Yang, Hanwang Zhang, Guojun Qi, Jianfei Cai
- Dialoglm: Pre-trained Model For Long Dialogue Understanding And Summarization Ming Zhong, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng
- TR-BERT: Dynamic Token Reduction For Accelerating BERT Inference Deming Ye, Yankai Lin, Yufei Huang, Maosong Sun
- Compressing Visual-linguistic Model Via Knowledge Distillation Zhiyuan Fang et al.
- Hierarchical Learning For Generation With Long Source Sequences Tobias Rohde, Xiaoxia Wu, Yinhan Liu
- Primer: Searching For Efficient Transformers For Language Modeling David R. So et al.
- Cross-attention Is All You Need: Adapting Pretrained Transformers For Machine Translation Mozhdeh Gheini, Xiang Ren, Jonathan May
- Luna: Linear Unified Nested Attention Xuezhe Ma et al.
- Long-span Summarization Via Local Attention And Content Selection Potsawee Manakul, Mark J. F. Gales
- Multimodal Transformer With Variable-length Memory For Vision-and-language Navigation Chuang Lin et al.
- Fastformer: Additive Attention Can Be All You Need Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang, Xing Xie
- Supervision Exists Everywhere: A Data Efficient Contrastive Language-image Pre-training Paradigm Yangguang Li et al.
- NÜWA: Visual Synthesis Pre-training For Neural Visual World Creation Chenfei Wu et al.
- Non-invasive Self-attention For Side Information Fusion In Sequential Recommendation Chang Liu et al.
- See, Hear, Read: Leveraging Multimodality With Guided Attention For Abstractive Text Summarization Yash Kumar Atri, Shraman Pramanick, Vikram Goyal, Tanmoy Chakraborty
- Vision Guided Generative Pre-trained Language Models For Multimodal Abstractive Summarization Tiezheng Yu, Wenliang Dai, Zihan Liu, Pascale Fung
- MMBERT: Multimodal BERT Pretraining For Improved Medical VQA Yash Khare et al.
- Understanding And Overcoming The Challenges Of Efficient Transformer Quantization Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort
- Prompt-learning For Fine-grained Entity Typing Ning Ding et al.
- SGEITL: Scene Graph Enhanced Image-text Learning For Visual Commonsense Reasoning Zhecan Wang et al.
- DYLE: Dynamic Latent Extraction For Abstractive Long-input Summarization Ziming Mao et al.
- Towards Few-shot Fact-checking Via Perplexity Nayeon Lee, Yejin Bang, Andrea Madotto, Madian Khabsa, Pascale Fung
- CDLM: Cross-document Language Modeling Avi Caciularu et al.
- Human Parity On Commonsenseqa: Augmenting Self-attention With External Attention Yichong Xu et al.
- An Empirical Study Of Training End-to-end Vision-and-language Transformers Zi-yi Dou et al.
- When Attention Meets Fast Recurrence: Training Language Models With Reduced Compute Tao Lei
- Worst Of Both Worlds: Biases Compound In Pre-trained Vision-and-language Models Tejas Srinivasan, Yonatan Bisk
- On Explaining Your Explanations Of BERT: An Empirical Study With Sequence Classification Zhengxuan Wu, Desmond C. Ong
- Towards Retrieval-based Conversational Recommendation Ahtsham Manzoor, Dietmar Jannach
- Visqa: X-raying Vision And Language Reasoning In Transformers Theo Jaunet et al.
- Exploring Transformers In Natural Language Generation: GPT, BERT, And Xlnet M. Onat Topal, Anil Bas, Imke Van Heerden
- Learned Token Pruning For Transformers Sehoon Kim et al.
- Multi-modal Understanding And Generation For Medical Images And Text Via Vision-language Pre-training Jong Hak Moon, Hyungyung Lee, Woncheol Shin, Young-hak Kim, Edward Choi
- UFO: A Unified Transformer For Vision-language Representation Learning Jianfeng Wang et al.
- Condenser: A Pre-training Architecture For Dense Retrieval Luyu Gao, Jamie Callan
- Hiddencut: Simple Data Augmentation For Natural Language Understanding With Better Generalization Jiaao Chen, Dinghan Shen, Weizhu Chen, Diyi Yang
- FLAT: An Optimized Dataflow For Mitigating Attention Bottlenecks Sheng-chun Kao, Suvinay Subramanian, Gaurav Agrawal, Amir Yazdanbakhsh, Tushar Krishna
- What Makes Good In-context Examples For GPT-3? Jiachang Liu et al.
- Lightningdot: Pre-training Visual-semantic Embeddings For Real-time Image-text Retrieval Siqi Sun et al.
- Code Structure Guided Transformer For Source Code Summarization Shuzheng Gao et al.
- Diagnosing Vision-and-language Navigation: What Really Matters Wanrong Zhu et al.
- Longt5: Efficient Text-to-text Transformer For Long Sequences Mandy Guo et al.
- Hurdles To Progress In Long-form Question Answering Kalpesh Krishna, Aurko Roy, Mohit Iyyer
- Dialogue History Matters! Personalized Response Selectionin Multi-turn Retrieval-based Chatbots Juntao Li et al.
- Improving Language Models By Retrieving From Trillions Of Tokens Sebastian Borgeaud et al.
- Visualgpt: Data-efficient Adaptation Of Pretrained Language Models For Image Captioning Jun Chen, Han Guo, Kai Yi, Boyang Li, Mohamed Elhoseiny
- MATE: Multi-view Attention For Table Transformer Efficiency Julian Martin Eisenschlos, Maharshi Gor, Thomas Müller, William W. Cohen
- Normformer: Improved Transformer Pretraining With Extra Normalization Sam Shleifer, Jason Weston, Myle Ott
- A Survey On Retrieval-augmented Text Generation Huayang Li, Yixuan Su, Deng Cai, Yan Wang, Lemao Liu
- A Length-extrapolatable Transformer Yutao Sun et al.
- Less Is More: Learning To Refine Dialogue History For Personalized Dialogue Generation Hanxun Zhong, Zhicheng Dou, Yutao Zhu, Hongjin Qian, Ji-rong Wen
- Hitskt: A Hierarchical Transformer Model For Session-aware Knowledge Tracing Fucai Ke et al.
- Zeroquant: Efficient And Affordable Post-training Quantization For Large-scale Transformers Zhewei Yao et al.
- Chatgpt Makes Medicine Easy To Swallow: An Exploratory Case Study On Simplified Radiology Reports Katharina Jeblick et al.
- Flashattention: Fast And Memory-efficient Exact Attention With Io-awareness Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré
- Accelerating Attention Through Gradient-based Learned Runtime Pruning Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, Hadi Esmaeilzadeh, Mingu Kang
- Biogpt: Generative Pre-trained Transformer For Biomedical Text Generation And Mining Renqian Luo et al.
- RASAT: Integrating Relational Structures Into Pretrained Seq2seq Model For Text-to-sql Jiexing Qi et al.
- Lilt: A Simple Yet Effective Language-independent Layout Transformer For Structured Document Understanding Jiapeng Wang, Lianwen Jin, Kai Ding
- Knowledge Prompting In Pre-trained Language Model For Natural Language Understanding Jianing Wang et al.
- Coca: Contrastive Captioners Are Image-text Foundation Models Jiahui Yu et al.
- Phenaki: Variable Length Video Generation From Open Domain Textual Description Ruben Villegas et al.
- Structured Pruning Learns Compact And Accurate Models Mengzhou Xia, Zexuan Zhong, Danqi Chen
- Sgva-clip: Semantic-guided Visual Adapting Of Vision-language Models For Few-shot Image Classification Fang Peng, Xiaoshan Yang, Linhui Xiao, Yaowei Wang, Changsheng Xu
- Vl-interpret: An Interactive Visualization Tool For Interpreting Vision-language Transformers Estelle Aflalo et al.
- Bytetransformer: A High-performance Transformer Boosted For Variable-length Inputs Yujia Zhai et al.
- A Generative Language Model For Few-shot Aspect-based Sentiment Analysis Ehsan Hosseini-asl, Wenhao Liu, Caiming Xiong
- Hyperprompt: Prompt-based Task-conditioning Of Transformers Yun He et al.
- Improving Passage Retrieval With Zero-shot Question Generation Devendra Singh Sachan et al.
- Protoclip: Prototypical Contrastive Language Image Pretraining Delong Chen et al.
- Block-recurrent Transformers Delesley Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur
- Future Transformer For Long-term Action Anticipation Dayoung Gong, Joonseok Lee, Manjin Kim, Seong Jong Ha, Minsu Cho
- Adaprompt: Adaptive Model Training For Prompt-based NLP Yulong Chen et al.
- Hungry Hungry Hippos: Towards Language Modeling With State Space Models Daniel Y. Fu et al.
- Why Can GPT Learn In-context? Language Models Implicitly Perform Gradient Descent As Meta-optimizers Damai Dai et al.
- Mplug: Effective And Efficient Vision-language Learning By Cross-modal Skip-connections Chenliang Li et al.
- Long-form Video-language Pre-training With Multimodal Temporal Contrastive Learning Yuchong Sun et al.
- In-context Learning And Induction Heads Catherine Olsson et al.
- Language Models Are General-purpose Interfaces Yaru Hao et al.
- Expanding Language-image Pretrained Models For General Video Recognition Bolin Ni et al.
- What Do They Capture? -- A Structural Analysis Of Pre-trained Language Models For Source Code Yao Wan et al.
- Recurrent Memory Transformer Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev
- Reshaping Robot Trajectories Using Natural Language Commands: A Study Of Multi-modal Data Alignment Using Transformers Arthur Bucker et al.
- Internet-augmented Language Models Through Few-shot Prompting For Open-domain Question Answering Angeliki Lazaridou, Elena Gribovskaya, Wojciech Stokowiec, Nikolai Grigorev
- Clinical-longformer And Clinical-bigbird: Transformers For Long Clinical Sequences Yikuan Li, Ramsey M. Wehbe, Faraz S. Ahmad, Hanyin Wang, Yuan Luo
- Prompt-to-prompt Image Editing With Cross Attention Control Amir Hertz et al.
- Dual Modality Prompt Tuning For Vision-language Pre-trained Model Yinghui Xing et al.
- ATTEMPT: Parameter-efficient Multi-task Tuning Via Attentional Mixtures Of Soft Prompts Akari Asai, Mohammadreza Salehi, Matthew E. Peters, Hannaneh Hajishirzi
- A Systematic Review And Replicability Study Of Bert4rec For Sequential Recommendation Aleksandr Petrov, Craig Macdonald
- Transformer Language Models Without Positional Encodings Still Learn Positional Information Adi Haviv, Ori Ram, Ofir Press, Peter Izsak, Omer Levy
- Generative Spoken Dialogue Language Modeling Tu Anh Nguyen et al.
- Parallel Context Windows For Large Language Models Nir Ratner et al.
- Llm.int8(): 8-bit Matrix Multiplication For Transformers At Scale Tim Dettmers, Mike Lewis, Younes Belkada, Luke Zettlemoyer
- Transformer Quality In Linear Time Weizhe Hua, Zihang Dai, Hanxiao Liu, Quoc V. Le
- Make-a-video: Text-to-video Generation Without Text-video Data Uriel Singer et al.
- CTRAN: Cnn-transformer-based Network For Natural Language Understanding Mehrdad Rafiepour, Javad Salimi Sartakhti
- A Systematic Study And Comprehensive Evaluation Of Chatgpt On Benchmark Datasets Md Tahmid Rahman Laskar et al.
- Applenet: Visual Attention Parameterized Prompt Learning For Few-shot Remote Sensing Image Generalization Using CLIP Mainak Singha, Ankit Jha, Bhupendra Solanki, Shirsha Bose, Biplab Banerjee
- Give Us The Facts: Enhancing Large Language Models With Knowledge Graphs For Fact-aware Language Modeling Linyao Yang, Hongyang Chen, Zhao Li, Xiao Ding, Xindong Wu
- A Survey On Large Language Models For Recommendation Likang Wu et al.
- Surgicalgpt: End-to-end Language-vision GPT For Visual Question Answering In Surgery Lalithkumar Seenivasan, Mobarakol Islam, Gokul Kannan, Hongliang Ren
- Inference-time Intervention: Eliciting Truthful Answers From A Language Model Kenneth Li, Oam Patel, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg
- A Comprehensive Capability Analysis Of GPT-3 And GPT-3.5 Series Models Junjie Ye et al.
- Recommendation As Instruction Following: A Large Language Model Empowered Recommendation Approach Junjie Zhang et al.
- Transferable Decoding With Visual Entities For Zero-shot Image Captioning Junjie Fei et al.
- Jatmo: Prompt Injection Defense By Task-specific Finetuning Julien Piet et al.
- GQA: Training Generalized Multi-query Transformer Models From Multi-head Checkpoints Joshua Ainslie et al.
- The Political Ideology Of Conversational AI: Converging Evidence On Chatgpt's Pro-environmental, Left-libertarian Orientation Jochen Hartmann, Jasper Schwenzow, Maximilian Witte
- Bias And Fairness In Chatbots: An Overview Jintang Xue et al.
- Graphix-t5: Mixing Pre-trained Transformers With Graph-aware Layers For Text-to-sql Parsing Jinyang Li et al.
- On The Robustness Of Chatgpt: An Adversarial And Out-of-distribution Perspective Jindong Wang et al.
- Longnet: Scaling Transformers To 1,000,000,000 Tokens Jiayu Ding et al.
- Badgpt: Exploring Security Vulnerabilities Of Chatgpt Via Backdoor Attacks To Instructgpt Jiawen Shi, Yixin Liu, Pan Zhou, Lichao Sun
- Rella: Retrieval-enhanced Large Language Models For Lifelong Sequential Behavior Comprehension In Recommendation Jianghao Lin et al.
- Imagebind-llm: Multi-modality Instruction Tuning Jiaming Han et al.
- Onellm: One Framework To Align All Modalities With Language Jiaming Han et al.
- Learning To Compress Prompts With Gist Tokens Jesse Mu, Xiang Lisa Li, Noah Goodman
- Leveraging Large Language Models For Sequential Recommendation Jesse Harte et al.
- Chatgpt In The Classroom: An Analysis Of Its Strengths And Weaknesses For Solving Undergraduate Computer Science Questions Ishika Joshi et al.
- Factuality Challenges In The Era Of Large Language Models Isabelle Augenstein et al.
- "It's Not Like Jarvis, But It's Pretty Close!" -- Examining Chatgpt's Usage Among Undergraduate Students In Computer Science Ishika Joshi, Ritvik Budhiraja, Harshal D Akolekar, Jagat Sesh Challa, Dhruv Kumar
- Ip-adapter: Text Compatible Image Prompt Adapter For Text-to-image Diffusion Models Hu Ye, Jun Zhang, Sibo Liu, Xiao Han, Wei Yang
- Cognitive Mirage: A Review Of Hallucinations In Large Language Models Hongbin Ye, Tong Liu, Aijia Zhang, Wei Hua, Weiqiang Jia
- Extending Context Window Of Large Language Models Via Positional Interpolation Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian
- Decoding Chatgpt: A Taxonomy Of Existing Research, Current Challenges, And Possible Future Directions Shahab Saquib Sohail et al.
- H2O: Heavy-hitter Oracle For Efficient Generative Inference Of Large Language Models Zhenyu Zhang et al.
- Let's Have A Chat! A Conversation With Chatgpt: Technology, Applications, And Limitations Sakib Shahriar, Kadhim Hayawi
- Tinystories: How Small Can Language Models Be And Still Speak Coherent English? Ronen Eldan, Yuanzhi Li
- Llama-adapter: Efficient Fine-tuning Of Language Models With Zero-init Attention Renrui Zhang et al.
- Translating Radiology Reports Into Plain Language Using Chatgpt And GPT-4 With Prompt Learning: Promising Results, Limitations, And Potential Qing Lyu et al.
- Grounded Text-to-image Synthesis With Attention Refocusing Quynh Phung, Songwei Ge, Jia-bin Huang
- Starcoder: May The Source Be With You! Raymond Li et al.
- Label Supervised Llama Finetuning Zongxi Li et al.
- Hyena Hierarchy: Towards Larger Convolutional Language Models Michael Poli et al.
- Large Language Model Alignment: A Survey Tianhao Shen et al.
- Multimodal-gpt: A Vision And Language Model For Dialogue With Humans Tao Gong et al.
- Uncovering Chatgpt's Capabilities In Recommender Systems Sunhao Dai et al.
- Chatgpt Is Fun, But It Is Not Funny! Humor Is Still Challenging Large Language Models Sophie Jentzsch, Kristian Kersting
- Expressive Text-to-image Generation With Rich Text Songwei Ge, Taesung Park, Jun-yan Zhu, Jia-bin Huang
- Llm-empowered Chatbots For Psychiatrist And Patient Simulation: Application And Evaluation Siyuan Chen et al.
- Mind Meets Machine: Unravelling Gpt-4's Cognitive Psychology Sifatkaur Dhingra, Manmeet Singh, Vaisakh Sb, Neetiraj Malviya, Sukhpal Singh Gill
- Opportunities And Challenges For Chatgpt And Large Language Models In Biomedicine And Health Shubo Tian et al.
- From Words To Watts: Benchmarking The Energy Costs Of Large Language Model Inference Siddharth Samsi et al.
- Chatgpt Beyond English: Towards A Comprehensive Evaluation Of Large Language Models In Multilingual Learning Viet Dac Lai et al.
- Language Model Behavior: A Comprehensive Survey Tyler A. Chang, Benjamin K. Bergen
- Flashattention-2: Faster Attention With Better Parallelism And Work Partitioning Tri Dao
- Trusting Your Evidence: Hallucinate Less With Context-aware Decoding Weijia Shi et al.
- REPLUG: Retrieval-augmented Black-box Language Models Weijia Shi et al.
- A Preliminary Evaluation Of Chatgpt For Zero-shot Dialogue Understanding Wenbo Pan, Qiguang Chen, Xiao Xu, Wanxiang Che, Libo Qin
- A Survey Of Large Language Models Wayne Xin Zhao et al.
- Capabilities Of GPT-4 On Medical Challenge Problems Harsha Nori, Nicholas King, Scott Mayer Mckinney, Dean Carignan, Eric Horvitz
- Is Chatgpt The Ultimate Programming Assistant -- How Far Is It? Haoye Tian et al.
- Chatgpt Or Grammarly? Evaluating Chatgpt On Grammatical Error Correction Benchmark Haoran Wu, Wenxuan Wang, Yuxuan Wan, Wenxiang Jiao, Michael Lyu
- Safety Assessment Of Chinese Large Language Models Hao Sun, Zhexin Zhang, Jiawen Deng, Jiale Cheng, Minlie Huang
- GPT-4 Enhanced Multimodal Grounding For Autonomous Driving: Leveraging Cross-modal Attention With Large Language Models Haicheng Liao et al.
- Efficient Streaming Language Models With Attention Sinks Guangxuan Xiao, Yuandong Tian, Beidi Chen, Song Han, Mike Lewis
- Text Matching Improves Sequential Recommendation By Reducing Popularity Biases Zhenghao Liu et al.
- Exploring Human-like Translation Strategy With Large Language Models Zhiwei He et al.
- Empower Large Language Model To Perform Better On Industrial Domain-specific Question Answering Fangkai Yang et al.
- Should Chatgpt Be Biased? Challenges And Risks Of Bias In Large Language Models Emilio Ferrara
- Llm-blender: Ensembling Large Language Models With Pairwise Ranking And Generative Fusion Dongfu Jiang, Xiang Ren, Bill Yuchen Lin
- Read-only Prompt Optimization For Vision-language Few-shot Learning Dongjun Lee et al.
- 3d-vista: Pre-trained Transformer For 3D Vision And Text Alignment Ziyu Zhu et al.
- AI And The FCI: Can Chatgpt Project An Understanding Of Introductory Physics? Colin G. West
- Is Chatgpt A General-purpose Natural Language Processing Task Solver? Chengwei Qin et al.
- One Small Step For Generative AI, One Giant Leap For AGI: A Complete Survey On Chatgpt In AIGC Era Chaoning Zhang et al.
- Blackvip: Black-box Visual Prompting For Robust Transfer Learning Changdae Oh et al.
- RWKV: Reinventing Rnns For The Transformer Era Bo Peng et al.
- How Close Is Chatgpt To Human Experts? Comparison Corpus, Evaluation, And Detection Biyang Guo et al.
- On The Application Of Large Language Models For Language Teaching And Assessment Technology Andrew Caines et al.
- Fundamentals Of Generative Large Language Models And Perspectives In Cyber-defense Andrei Kucharavy et al.
- The Impact Of Positional Encoding On Length Generalization In Transformers Amirhossein Kazemnejad, Inkit Padhi, Karthikeyan Natesan Ramamurthy, Payel Das, Siva Reddy
- Large Language Models For Telecom: Forthcoming Impact On The Industry Ali Maatouk, Nicola Piovesan, Fadhel Ayed, Antonio De Domenico, Merouane Debbah
- A Categorical Archive Of Chatgpt Failures Ali Borji
- What Does CLIP Know About A Red Circle? Visual Prompt Engineering For Vlms Aleksandar Shtedritski, Christian Rupprecht, Andrea Vedaldi
- Mamba: Linear-time Sequence Modeling With Selective State Spaces Albert Gu, Tri Dao
- Mistral 7B Albert Q. Jiang et al.
- Enhancing Retrieval-augmented Large Language Models With Iterative Retrieval-generation Synergy Zhihong Shao et al.
- Flexgen: High-throughput Generative Inference Of Large Language Models With A Single GPU Ying Sheng et al.
- A Comparative Study Of Pretrained Language Models For Long Clinical Text Yikuan Li, Ramsey M. Wehbe, Faraz S. Ahmad, Hanyin Wang, Yuan Luo
- A Comprehensive Survey Of Ai-generated Content (AIGC): A History Of Generative AI From GAN To Chatgpt Yihan Cao et al.
- Key-locked Rank One Editing For Text-to-image Personalization Yoad Tewel, Rinon Gal, Gal Chechik, Yuval Atzmon
- "Do Anything Now": Characterizing And Evaluating In-the-wild Jailbreak Prompts On Large Language Models Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, Yang Zhang
- Vision Language Models In Autonomous Driving: A Survey And Outlook Xingcheng Zhou et al.
- Query Rewriting For Retrieval-augmented Large Language Models Xinbei Ma, Yeyun Gong, Pengcheng He, Hai Zhao, Nan Duan
- Representation Learning With Large Language Models For Recommendation Xubin Ren et al.
- Retentive Network: A Successor To Transformer For Large Language Models Yutao Sun et al.
- Is Chatgpt A Good Sentiment Analyzer? A Preliminary Study Zengzhi Wang et al.
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- Transformers Are Ssms: Generalized Models And Efficient Algorithms Through Structured State Space Duality Tri Dao, Albert Gu
- Exploring Chatgpt And Its Impact On Society Md. Asraful Haque, Shuai Li
- Xlstm: Extended Long Short-term Memory Maximilian Beck et al.
- Linrec: Linear Attention Mechanism For Long-term Sequential Recommender Systems Langming Liu et al.
- Pixart-σ: Weak-to-strong Training Of Diffusion Transformer For 4K Text-to-image Generation Junsong Chen et al.
- Building Better AI Agents: A Provocation On The Utilisation Of Persona In Llm-based Conversational Agents Guangzhi Sun, Xiao Zhan, Jose Such
- Gemma 2: Improving Open Language Models At A Practical Size Gemma Team et al.
- The Power Of Noise: Redefining Retrieval For RAG Systems Florin Cuconasu et al.
- Deepseek-v2: A Strong, Economical, And Efficient Mixture-of-experts Language Model Deepseek-ai et al.
- A Survey On Lora Of Large Language Models Yuren Mao et al.
- Trustllm: Trustworthiness In Large Language Models Yue Huang et al.
- Data-efficient Fine-tuning For Llm-based Recommendation Xinyu Lin et al.
🏷 BERT
- Multilingual Constituency Parsing With Self-attention And Pre-training Nikita Kitaev, Steven Cao, Dan Klein
- Can You Tell Me How To Get Past Sesame Street? Sentence-level Pretraining Beyond Language Modeling Alex Wang et al.
- Sdnet: Contextualized Attention-based Deep Network For Conversational Question Answering Chenguang Zhu, Michael Zeng, Xuedong Huang
- Sentence Encoders On Stilts: Supplementary Training On Intermediate Labeled-data Tasks Jason Phang, Thibault Févry, Samuel R. Bowman
- "bilingual Expert" Can Find Translation Errors Kai Fan et al.
- BERT: Pre-training Of Deep Bidirectional Transformers For Language Understanding Jacob Devlin, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Is Multilingual BERT Fluent In Language Generation? Samuel Rönnqvist, Jenna Kanerva, Tapio Salakoski, Filip Ginter
- Multi-passage BERT: A Globally Normalized BERT Model For Open-domain Question Answering Zhiguo Wang, Patrick Ng, Xiaofei Ma, Ramesh Nallapati, Bing Xiang
- MKD: A Multi-task Knowledge Distillation Approach For Pretrained Language Models Linqing Liu, Huan Wang, Jimmy Lin, Richard Socher, Caiming Xiong
- BART: Denoising Sequence-to-sequence Pre-training For Natural Language Generation, Translation, And Comprehension Mike Lewis et al.
- Scalable Attentive Sentence-pair Modeling Via Distilled Sentence Embedding Oren Barkan et al.
- Structbert: Incorporating Language Structures Into Pre-training For Deep Language Understanding Wei Wang et al.
- Probing Natural Language Inference Models Through Semantic Fragments Kyle Richardson, Hai Hu, Lawrence S. Moss, Ashish Sabharwal
- Bert4rec: Sequential Recommendation With Bidirectional Encoder Representations From Transformer Fei Sun et al.
- Acquiring Knowledge From Pre-trained Model To Neural Machine Translation Rongxiang Weng, Heng Yu, Shujian Huang, Shanbo Cheng, Weihua Luo
- Revealing The Dark Secrets Of BERT Olga Kovaleva, Alexey Romanov, Anna Rogers, Anna Rumshisky
- Align, Mask And Select: A Simple Method For Incorporating Commonsense Knowledge Into Language Representation Models Zhi-xiu Ye, Qian Chen, Wen Wang, Zhen-hua Ling
- Olmpics -- On What Language Model Pre-training Captures Alon Talmor, Yanai Elazar, Yoav Goldberg, Jonathan Berant
- Multiqa: An Empirical Investigation Of Generalization And Transfer In Reading Comprehension Alon Talmor, Jonathan Berant
- Masked Language Model Scoring Julian Salazar, Davis Liang, Toan Q. Nguyen, Katrin Kirchhoff
- Understanding The Behaviors Of BERT In Ranking Yifan Qiao, Chenyan Xiong, Zhenghao Liu, Zhiyuan Liu
- Convert: Efficient And Accurate Conversational Representations From Transformers Matthew Henderson et al.
- Reducing Transformer Depth On Demand With Structured Dropout Angela Fan, Edouard Grave, Armand Joulin
- Answering Complex Open-domain Questions Through Iterative Query Generation Peng Qi, Xiaowen Lin, Leo Mehr, Zijian Wang, Christopher D. Manning
- Q8BERT: Quantized 8bit BERT Ofir Zafrir, Guy Boudoukh, Peter Izsak, Moshe Wasserblat
- Pretrained Language Models For Sequential Sentence Classification Arman Cohan, Iz Beltagy, Daniel King, Bhavana Dalvi, Daniel S. Weld
- Pretrained Language Models For Document-level Neural Machine Translation Liangyou Li, Xin Jiang, Qun Liu
- Language Models As Knowledge Bases? Fabio Petroni et al.
- BERT For Joint Intent Classification And Slot Filling Qian Chen, Zhu Zhuo, Wen Wang
- Distilbert, A Distilled Version Of BERT: Smaller, Faster, Cheaper And Lighter Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
- Camembert: A Tasty French Language Model Louis Martin et al.
- Encode, Tag, Realize: High-precision Text Editing Eric Malmi, Sebastian Krause, Sascha Rothe, Daniil Mirylenka, Aliaksei Severyn
- Unicoder: A Universal Language Encoder By Pre-training With Multiple Cross-lingual Tasks Haoyang Huang et al.
- LXMERT: Learning Cross-modality Encoder Representations From Transformers Hao Tan, Mohit Bansal
- Well-read Students Learn Better: On The Importance Of Pre-training Compact Models Iulia Turc, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Unified Language Model Pre-training For Natural Language Understanding And Generation Li Dong et al.
- UER: An Open-source Toolkit For Pre-training Models Zhe Zhao et al.
- Unsupervised Cross-lingual Representation Learning At Scale Alexis Conneau et al.
- Reweighted Proximal Pruning For Large-scale Language Representation Fu-ming Guo, Sijia Liu, Finlay S. Mungall, Xue Lin, Yanzhi Wang
- BERT Has A Mouth, And It Must Speak: BERT As A Markov Random Field Language Model Alex Wang, Kyunghyun Cho
- Cloze-driven Pretraining Of Self-attention Networks Alexei Baevski, Sergey Edunov, Yinhan Liu, Luke Zettlemoyer, Michael Auli
- Are Sixteen Heads Really Better Than One? Paul Michel, Omer Levy, Graham Neubig
- Linking Artificial And Human Neural Representations Of Language Jon Gauthier, Roger Levy
- Structured Pruning Of A Bert-based Question Answering Model J. S. Mccarley, Rishav Chakravarti, Avirup Sil
- Tinybert: Distilling BERT For Natural Language Understanding Xiaoqi Jiao et al.
- Frustratingly Easy Natural Question Answering Lin Pan et al.
- Review Conversational Reading Comprehension Hu Xu, Bing Liu, Lei Shu, Philip S. Yu
- Myers-briggs Personality Classification And Personality-specific Language Generation Using Pre-trained Language Models Sedrick Scott Keh, I-tsun Cheng
- A Generalized Framework Of Sequence Generation With Application To Undirected Sequence Models Elman Mansimov, Alex Wang, Sean Welleck, Kyunghyun Cho
- An Effective Domain Adaptive Post-training Method For BERT In Response Selection Taesun Whang et al.
- The Bottom-up Evolution Of Representations In The Transformer: A Study With Machine Translation And Language Modeling Objectives Elena Voita, Rico Sennrich, Ivan Titov
- How Does BERT Answer Questions? A Layer-wise Analysis Of Transformer Representations Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- What Would Elsa Do? Freezing Layers During Transformer Fine-tuning Jaejun Lee, Raphael Tang, Jimmy Lin
- Exbert: A Visual Analysis Tool To Explore Learned Representations In Transformers Models Benjamin Hoover, Hendrik Strobelt, Sebastian Gehrmann
- VL-BERT: Pre-training Of Generic Visual-linguistic Representations Weijie Su et al.
- Story Ending Prediction By Transferable BERT Zhongyang Li, Xiao Ding, Ting Liu
- MASS: Masked Sequence To Sequence Pre-training For Language Generation Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-yan Liu
- A Simple But Effective Method To Incorporate Multi-turn Context With BERT For Conversational Machine Comprehension Yasuhito Ohsugi, Itsumi Saito, Kyosuke Nishida, Hisako Asano, Junji Tomita
- Interpreting And Improving Natural-language Processing (in Machines) With Natural Language-processing (in The Brain) Mariya Toneva, Leila Wehbe
- On The Use Of BERT For Neural Machine Translation Stéphane Clinchant, Kweon Woo Jung, Vassilina Nikoulina
- Leveraging Pre-trained Checkpoints For Sequence Generation Tasks Sascha Rothe, Shashi Narayan, Aliaksei Severyn
- Multi-hop Question Answering Via Reasoning Chains Jifan Chen, Shih-ting Lin, Greg Durrett
- Distilling Knowledge Learned In BERT For Text Generation Yen-chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu
- Berts Of A Feather Do Not Generalize Together: Large Variability In Generalization Across Models With Similar Test Set Performance R. Thomas Mccoy, Junghyun Min, Tal Linzen
- Semantics-aware BERT For Language Understanding Zhuosheng Zhang et al.
- Data Augmentation For BERT Fine-tuning In Open-domain Question Answering Wei Yang et al.
- Inducing Brain-relevant Bias In Natural Language Processing Models Dan Schwartz, Mariya Toneva, Leila Wehbe
- A Multiscale Visualization Of Attention In The Transformer Model Jesse Vig
- Parameter-efficient Transfer Learning For NLP Neil Houlsby et al.
- Automatic Spanish Translation Of The Squad Dataset For Multilingual Question Answering Casimiro Pio Carrino, Marta R. Costa-jussà, José A. R. Fonollosa
- Visualizing And Understanding The Effectiveness Of BERT Yaru Hao, Li Dong, Furu Wei, Ke Xu
- Do Attention Heads In BERT Track Syntactic Dependencies? Phu Mon Htut, Jason Phang, Shikha Bordia, Samuel R. Bowman
- Tree Transformer: Integrating Tree Structures Into Self-attention Yau-shian Wang, Hung-yi Lee, Yun-nung Chen
- Structured Pruning Of Large Language Models Ziheng Wang, Jeremy Wohlwend, Tao Lei
- Bertscore: Evaluating Text Generation With BERT Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q. Weinberger, Yoav Artzi
- Attentive History Selection For Conversational Question Answering Chen Qu et al.
- Freelb: Enhanced Adversarial Training For Natural Language Understanding Chen Zhu et al.
- Evaluating Commonsense In Pre-trained Language Models Xuhui Zhou, Yue Zhang, Leyang Cui, Dandan Huang
- Model Compression With Two-stage Multi-teacher Knowledge Distillation For Web Question Answering System Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang
- Learning And Evaluating Contextual Embedding Of Source Code Aditya Kanade, Petros Maniatis, Gogul Balakrishnan, Kensen Shi
- Linguistic Knowledge And Transferability Of Contextual Representations Nelson F. Liu, Matt Gardner, Yonatan Belinkov, Matthew E. Peters, Noah A. Smith
- Unicoder-vl: A Universal Encoder For Vision And Language By Cross-modal Pre-training Gen Li et al.
- Blockwise Self-attention For Long Document Understanding Jiezhong Qiu et al.
- Span Selection Pre-training For Question Answering Michael Glass et al.
- Adapting And Evaluating A Deep Learning Language Model For Clinical Why-question Answering Andrew Wen, Mohamed Y. Elwazir, Sungrim Moon, Jungwei Fan
- Roberta: A Robustly Optimized BERT Pretraining Approach Yinhan Liu et al.
- Synthetic QA Corpora Generation With Roundtrip Consistency Chris Alberti, Daniel Andor, Emily Pitler, Jacob Devlin, Michael Collins
- Microsoft Translator At WMT 2019: Towards Large-scale Document-level Neural Machine Translation Marcin Junczys-dowmunt
- Megatron-lm: Training Multi-billion Parameter Language Models Using Model Parallelism Mohammad Shoeybi et al.
- ALBERT: A Lite BERT For Self-supervised Learning Of Language Representations Zhenzhong Lan et al.
- Beto, Bentz, Becas: The Surprising Cross-lingual Effectiveness Of BERT Shijie Wu, Mark Dredze
- What Does BERT Learn From Multiple-choice Reading Comprehension Datasets? Chenglei Si, Shuohang Wang, Min-yen Kan, Jing Jiang
- Syntax-infused Transformer And BERT Models For Machine Translation And Natural Language Understanding Dhanasekar Sundararaman et al.
- Transfer Fine-tuning: A BERT Case Study Yuki Arase, Junichi Tsujii
- Towards Transfer Learning For End-to-end Speech Synthesis From Deep Pre-trained Language Models Wei Fang, Yu-an Chung, James Glass
- Pretrained Encyclopedia: Weakly Supervised Knowledge-pretrained Language Model Wenhan Xiong, Jingfei Du, William Yang Wang, Veselin Stoyanov
- Text Summarization With Pretrained Encoders Yang Liu, Mirella Lapata
- Towards Making The Most Of BERT In Neural Machine Translation Jiacheng Yang et al.
- Learning To Answer By Learning To Ask: Getting The Best Of GPT-2 And BERT Worlds Tassilo Klein, Moin Nabi
- Visualizing Attention In Transformer-based Language Representation Models Jesse Vig
- Harnessing Evolution Of Multi-turn Conversations For Effective Answer Retrieval Mohammad Aliannejadi, Manajit Chakraborty, Esteban Andrés Ríssola, Fabio Crestani
- Sg-net: Syntax-guided Machine Reading Comprehension Zhuosheng Zhang et al.
- Colake: Contextualized Language And Knowledge Embedding Tianxiang Sun et al.
- Pre-training Text-to-text Transformers For Concept-centric Common Sense Wangchunshu Zhou et al.
- XGLUE: A New Benchmark Dataset For Cross-lingual Pre-training, Understanding And Generation Yaobo Liang et al.
- To Pretrain Or Not To Pretrain: Examining The Benefits Of Pretraining On Resource Rich Tasks Sinong Wang, Madian Khabsa, Hao Ma
- Bert-of-theseus: Compressing BERT By Progressive Module Replacing Canwen Xu, Wangchunshu Zhou, Tao Ge, Furu Wei, Ming Zhou
- Pre-trained Summarization Distillation Sam Shleifer, Alexander M. Rush
- REALM: Retrieval-augmented Language Model Pre-training Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-wei Chang
- Fine-tuning Pretrained Language Models: Weight Initializations, Data Orders, And Early Stopping Jesse Dodge et al.
- When BERT Plays The Lottery, All Tickets Are Winning Sai Prasanna, Anna Rogers, Anna Rumshisky
- Pretrained Transformers Improve Out-of-distribution Robustness Dan Hendrycks et al.
- Phobert: Pre-trained Language Models For Vietnamese Dat Quoc Nguyen, Anh Tuan Nguyen
- Injecting Numerical Reasoning Skills Into Language Models Mor Geva, Ankit Gupta, Jonathan Berant
- Intermediate-task Transfer Learning With Pretrained Models For Natural Language Understanding: When And Why Does It Work? Yada Pruksachatkun et al.
- Pretrained Transformers For Simple Question Answering Over Knowledge Graphs D. Lukovnikov, A. Fischer, J. Lehmann
- KVL-BERT: Knowledge Enhanced Visual-and-linguistic BERT For Visual Commonsense Reasoning Dandan Song, Siyi Ma, Zhanchen Sun, Sicheng Yang, Lejian Liao
- CG-BERT: Conditional Text Generation With BERT For Generalized Few-shot Intent Detection Congying Xia, Chenwei Zhang, Hoang Nguyen, Jiawei Zhang, Philip Yu
- How Effective Is Task-agnostic Data Augmentation For Pretrained Transformers? Shayne Longpre, Yu Wang, Christopher Dubois
- Deebert: Dynamic Early Exiting For Accelerating BERT Inference Ji Xin, Raphael Tang, Jaejun Lee, Yaoliang Yu, Jimmy Lin
- Speaker-aware BERT For Multi-turn Response Selection In Retrieval-based Chatbots Jia-chen Gu et al.
- Optimus: Organizing Sentences Via Pre-trained Modeling Of A Latent Space Chunyuan Li et al.
- Do Response Selection Models Really Know What's Next? Utterance Manipulation Strategies For Multi-turn Response Selection Taesun Whang et al.
- Masking As An Efficient Alternative To Finetuning For Pretrained Language Models Mengjie Zhao, Tao Lin, Fei Mi, Martin Jaggi, Hinrich Schütze
- Exploring Fine-tuning Techniques For Pre-trained Cross-lingual Models Via Continual Learning Zihan Liu, Genta Indra Winata, Andrea Madotto, Pascale Fung
- Ternarybert: Distillation-aware Ultra-low Bit BERT Wei Zhang et al.
- TOD-BERT: Pre-trained Natural Language Understanding For Task-oriented Dialogue Chien-sheng Wu, Steven Hoi, Richard Socher, Caiming Xiong
- Coda: Contrast-enhanced And Diversity-promoting Data Augmentation For Natural Language Understanding Yanru Qu et al.
- VD-BERT: A Unified Vision And Dialog Transformer With BERT Yue Wang et al.
- Compressing Large-scale Transformer-based Models: A Case Study On BERT Prakhar Ganesh et al.
- Better Robustness By More Coverage: Adversarial Training With Mixup Augmentation For Robust Fine-tuning Chenglei Si et al.
- Contextualized Perturbation For Textual Adversarial Attack Dianqi Li et al.
- Byte Pair Encoding Is Suboptimal For Language Model Pretraining Kaj Bostrom, Greg Durrett
- Coreferential Reasoning Learning For Language Representation Deming Ye et al.
- XGPT: Cross-modal Generative Pre-training For Image Captioning Qiaolin Xia et al.
- Inducing Language-agnostic Multilingual Representations Wei Zhao, Steffen Eger, Johannes Bjerva, Isabelle Augenstein
- On The Stability Of Fine-tuning BERT: Misconceptions, Explanations, And Strong Baselines Marius Mosbach, Maksym Andriushchenko, Dietrich Klakow
- Codebert: A Pre-trained Model For Programming And Natural Languages Zhangyin Feng et al.
- GRUEN For Evaluating Linguistic Quality Of Generated Text Wanzheng Zhu, Suma Bhat
- Deformer: Decomposing Pre-trained Transformers For Faster Question Answering Qingqing Cao, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian
- Efficient Transformer-based Large Scale Language Representations Using Hardware-friendly Block Structured Pruning Bingbing Li et al.
- PALM: Pre-training An Autoencoding&autoregressive Language Model For Context-conditioned Generation Bin Bi et al.
- Visbert: Hidden-state Visualizations For Transformers Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- When Being Unseen From Mbert Is Just The Beginning: Handling New Languages With Multilingual Language Models Benjamin Muller, Antonis Anastasopoulos, Benoît Sagot, Djamé Seddah
- Training Large Neural Networks With Constant Memory Using A New Execution Algorithm Bharadwaj Pudipeddi, Maral Mesmakhosroshahi, Jinwen Xi, Sujeeth Bharadwaj
- Tabert: Pretraining For Joint Understanding Of Textual And Tabular Data Pengcheng Yin, Graham Neubig, Wen-tau Yih, Sebastian Riedel
- Contrastive Distillation On Intermediate Representations For Language Model Compression Siqi Sun et al.
- When Do You Need Billions Of Words Of Pretraining Data? Yian Zhang, Alex Warstadt, Haau-sing Li, Samuel R. Bowman
- Bert-hlstms: BERT And Hierarchical Lstms For Visual Storytelling Jing Su, Qingyun Dai, Frank Guerin, Mian Zhou
- Beyond I.I.D.: Three Levels Of Generalization For Question Answering On Knowledge Bases Yu Gu et al.
- Chatbot Interaction With Artificial Intelligence: Human Data Augmentation With T5 And Language Transformer Ensemble For Text Classification Jordan J. Bird, Anikó Ekárt, Diego R. Faria
- Improving Vision-and-language Navigation With Image-text Pairs From The Web Arjun Majumdar et al.
- SPECTER: Document-level Representation Learning Using Citation-informed Transformers Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld
- DIET: Lightweight Language Understanding For Dialogue Systems Tanja Bunk, Daksh Varshneya, Vladimir Vlasov, Alan Nichol
- From Zero To Hero: On The Limitations Of Zero-shot Cross-lingual Transfer With Multilingual Transformers Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš
- A Comparison Of LSTM And BERT For Small Corpus Aysu Ezen-can
- GMAT: Global Memory Augmentation For Transformers Ankit Gupta, Jonathan Berant
- A Recurrent Vision-and-language BERT For Navigation Yicong Hong, Qi Wu, Yuankai Qi, Cristian Rodriguez-opazo, Stephen Gould
- ABNIRML: Analyzing The Behavior Of Neural IR Models Sean Macavaney, Sergey Feldman, Nazli Goharian, Doug Downey, Arman Cohan
- Incorporating BERT Into Parallel Sequence Decoding With Adapters Junliang Guo et al.
- Syntactic Data Augmentation Increases Robustness To Inference Heuristics Junghyun Min, R. Thomas Mccoy, Dipanjan Das, Emily Pitler, Tal Linzen
- What Happens To BERT Embeddings During Fine-tuning? Amil Merchant, Elahe Rahimtoroghi, Ellie Pavlick, Ian Tenney
- Behind The Scene: Revealing The Secrets Of Pre-trained Vision-and-language Models Jize Cao et al.
- Relevance-guided Supervision For Openqa With Colbert Omar Khattab, Christopher Potts, Matei Zaharia
- BLEURT: Learning Robust Metrics For Text Generation Thibault Sellam, Dipanjan Das, Ankur P. Parikh
- Edgebert: Sentence-level Energy Optimizations For Latency-aware Multi-task NLP Inference Thierry Tambe et al.
- Recall And Learn: Fine-tuning Deep Pretrained Language Models With Less Forgetting Sanyuan Chen et al.
- GOBO: Quantizing Attention-based NLP Models For Low Latency And Energy Efficient Inference Ali Hadi Zadeh, Isak Edo, Omar Mohamed Awad, Andreas Moshovos
- TRANS-BLSTM: Transformer With Bidirectional LSTM For Language Understanding Zhiheng Huang, Peng Xu, Davis Liang, Ajay Mishra, Bing Xiang
- Fine-tuning BERT For Schema-guided Zero-shot Dialogue State Tracking Yu-ping Ruan, Zhen-hua Ling, Jia-chen Gu, Quan Liu
- GREEK-BERT: The Greeks Visiting Sesame Street John Koutsikakis, Ilias Chalkidis, Prodromos Malakasiotis, Ion Androutsopoulos
- Contrastive Code Representation Learning Paras Jain et al.
- Adapterhub: A Framework For Adapting Transformers Jonas Pfeiffer et al.
- POINTER: Constrained Progressive Text Generation Via Insertion-based Generative Pre-training Yizhe Zhang et al.
- Logic-guided Data Augmentation And Regularization For Consistent Question Answering Akari Asai, Hannaneh Hajishirzi
- Dialoglue: A Natural Language Understanding Benchmark For Task-oriented Dialogue Shikib Mehri, Mihail Eric, Dilek Hakkani-tur
- Big Bird: Transformers For Longer Sequences Manzil Zaheer et al.
- Text-to-text Pre-training For Data-to-text Tasks Mihir Kale, Abhinav Rastogi
- Earlybert: Efficient BERT Training Via Early-bird Lottery Tickets Xiaohan Chen et al.
- Cosda-ml: Multi-lingual Code-switching Data Augmentation For Zero-shot Cross-lingual NLP Libo Qin, Minheng Ni, Yue Zhang, Wanxiang Che
- Probing Pretrained Language Models For Lexical Semantics Ivan Vulić, Edoardo Maria Ponti, Robert Litschko, Goran Glavaš, Anna Korhonen
- Human Instruction-following With Deep Reinforcement Learning Via Transfer-learning From Text Felix Hill, Sona Mokra, Nathaniel Wong, Tim Harley
- Pre-training Via Paraphrasing Mike Lewis et al.
- How Context Affects Language Models' Factual Predictions Fabio Petroni et al.
- How Fine Can Fine-tuning Be? Learning Efficient Language Models Evani Radiya-dixit, Xin Wang
- Calibration Of Pre-trained Transformers Shrey Desai, Greg Durrett
- Robust Encodings: A Framework For Combating Adversarial Typos Erik Jones, Robin Jia, Aditi Raghunathan, Percy Liang
- Accelerating Training Of Transformer-based Language Models With Progressive Layer Dropping Minjia Zhang, Yuxiong He
- CERT: Contrastive Self-supervised Learning For Language Understanding Hongchao Fang, Sicheng Wang, Meng Zhou, Jiayuan Ding, Pengtao Xie
- On Learning Universal Representations Across Languages Xiangpeng Wei et al.
- Length-adaptive Transformer: Train Once With Length Drop, Use Anytime With Search Gyuwan Kim, Kyunghyun Cho
- What Does BERT Know About Books, Movies And Music? Probing BERT For Conversational Recommendation Gustavo Penha, Claudia Hauff
- Rethinking Positional Encoding In Language Pre-training Guolin Ke, Di He, Tie-yan Liu
- Residual Energy-based Models For Text Generation Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'aurelio Ranzato
- Mixup-transformer: Dynamic Data Augmentation For NLP Tasks Lichao Sun et al.
- Mobilebert: A Compact Task-agnostic BERT For Resource-limited Devices Zhiqing Sun et al.
- Dialogbert: Discourse-aware Response Generation Via Learning To Recover And Rank Utterances Xiaodong Gu, Kang Min Yoo, Jung-woo Ha
- Document Ranking With A Pretrained Sequence-to-sequence Model Rodrigo Nogueira, Zhiying Jiang, Jimmy Lin
- Unilmv2: Pseudo-masked Language Models For Unified Language Model Pre-training Hangbo Bao et al.
- On The Effect Of Dropping Layers Of Pre-trained Transformer Models Hassan Sajjad, Fahim Dalvi, Nadir Durrani, Preslav Nakov
- Lightseq: A High Performance Inference Library For Transformers Xiaohui Wang, Ying Xiong, Yang Wei, Mingxuan Wang, Lei Li
- BERT Based Multilingual Machine Comprehension In English And Hindi Somil Gupta, Nilesh Khade
- Minilm: Deep Self-attention Distillation For Task-agnostic Compression Of Pre-trained Transformers Wenhui Wang et al.
- X-LXMERT: Paint, Caption And Answer Questions With Multi-modal Transformers Jaemin Cho, Jiasen Lu, Dustin Schwenk, Hannaneh Hajishirzi, Aniruddha Kembhavi
- TAP: Text-aware Pre-training For Text-vqa And Text-caption Zhengyuan Yang et al.
- Trojaning Language Models For Fun And Profit Xinyang Zhang, Zheng Zhang, Shouling Ji, Ting Wang
- ELECTRA: Pre-training Text Encoders As Discriminators Rather Than Generators Kevin Clark, Minh-thang Luong, Quoc V. Le, Christopher D. Manning
- Adversarial Training For Large Neural Language Models Xiaodong Liu et al.
- An Empirical Study On Robustness To Spurious Correlations Using Pre-trained Language Models Lifu Tu, Garima Lalwani, Spandana Gella, He He
- Data Augmentation Using Pre-trained Transformer Models Varun Kumar, Ashutosh Choudhary, Eunah Cho
- BERT Loses Patience: Fast And Robust Inference With Early Exit Wangchunshu Zhou et al.
- Charbert: Character-aware Pre-trained Language Model Wentao Ma et al.
- Longformer: The Long-document Transformer Iz Beltagy, Matthew E. Peters, Arman Cohan
- LRC-BERT: Latent-representation Contrastive Knowledge Distillation For Natural Language Understanding Hao Fu et al.
- Minilmv2: Multi-head Self-attention Relation Distillation For Compressing Pretrained Transformers Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei
- The NLP Cookbook: Modern Recipes For Transformer Based Deep Learning Architectures Sushant Singh, Ausif Mahmood
- Evaluating The Robustness Of Retrieval Pipelines With Query Variation Generators Gustavo Penha, Arthur Câmara, Claudia Hauff
- Improved Text Classification Via Contrastive Adversarial Training Lin Pan, Chung-wei Hang, Avirup Sil, Saloni Potdar
- Improving Stack Overflow Question Title Generation With Copying Enhanced Codebert Model And Bi-modal Information Fengji Zhang et al.
- Robeczech: Czech Roberta, A Monolingual Contextualized Language Representation Model Milan Straka, Jakub Náplava, Jana Straková, David Samuel
- Evaluating The Robustness Of Neural Language Models To Input Perturbations Milad Moradi, Matthias Samwald
- Wangchanberta: Pretraining Transformer-based Thai Language Models Lalita Lowphansirikul, Charin Polpanumas, Nawat Jantrakulchai, Sarana Nutanong
- MWP-BERT: Numeracy-augmented Pre-training For Math Word Problem Solving Zhenwen Liang et al.
- Bob: BERT Over BERT For Training Persona-based Dialogue Models From Limited Personalized Data Haoyu Song, Yan Wang, Kaiyan Zhang, Wei-nan Zhang, Ting Liu
- Personalized Transformer For Explainable Recommendation Lei Li, Yongfeng Zhang, Li Chen
- BERT, Mbert, Or Bibert? A Study On Contextualized Embeddings For Neural Machine Translation Haoran Xu, Benjamin Van Durme, Kenton Murray
- Fine-tuning Large Neural Language Models For Biomedical Natural Language Processing Robert Tinn et al.
- Fewshotqa: A Simple Framework For Few-shot Learning Of Question Answering Tasks Using Pre-trained Text-to-text Models Rakesh Chada, Pradeep Natarajan
- Training Large-scale News Recommenders With Pretrained Language Models In The Loop Shitao Xiao, Zheng Liu, Yingxia Shao, Tao Di, Xing Xie
- Vision-and-language Or Vision-for-language? On Cross-modal Influence In Multimodal Transformers Stella Frank, Emanuele Bugliarello, Desmond Elliott
- Revisiting The Primacy Of English In Zero-shot Cross-lingual Transfer Iulia Turc, Kenton Lee, Jacob Eisenstein, Ming-wei Chang, Kristina Toutanova
- Swinbert: End-to-end Transformers With Sparse Attention For Video Captioning Kevin Lin et al.
- Fast Model Editing At Scale Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning
- Using Prior Knowledge To Guide Bert's Attention In Semantic Textual Matching Tasks Tingyu Xia, Yue Wang, Yuan Tian, Yi Chang
- GPT-3 Models Are Poor Few-shot Learners In The Biomedical Domain Milad Moradi, Kathrin Blagec, Florian Haberl, Matthias Samwald
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- How Should Pre-trained Language Models Be Fine-tuned Towards Adversarial Robustness? Xinhsuai Dong, Luu Anh Tuan, Min Lin, Shuicheng Yan, Hanwang Zhang
- Bitfit: Simple Parameter-efficient Fine-tuning For Transformer-based Masked Language-models Elad Ben Zaken, Shauli Ravfogel, Yoav Goldberg
- Lora: Low-rank Adaptation Of Large Language Models Edward J. Hu et al.
- Urltran: Improving Phishing URL Detection Using Transformers Pranav Maneriker et al.
- Self-guided Contrastive Learning For BERT Sentence Representations Taeuk Kim, Kang Min Yoo, Sang-goo Lee
- TR-BERT: Dynamic Token Reduction For Accelerating BERT Inference Deming Ye, Yankai Lin, Yufei Huang, Maosong Sun
- Knowledge Neurons In Pretrained Transformers Damai Dai et al.
- Differentially Private Fine-tuning Of Language Models Da Yu et al.
- Automated Quality Assessment Of Cognitive Behavioral Therapy Sessions Through Highly Contextualized Language Representations Nikolaos Flemotomos et al.
- Codet5: Identifier-aware Unified Pre-trained Encoder-decoder Models For Code Understanding And Generation Yue Wang, Weishi Wang, Shafiq Joty, Steven C. H. Hoi
- One Teacher Is Enough? Pre-trained Language Model Distillation From Multiple Teachers Chuhan Wu, Fangzhao Wu, Yongfeng Huang
- Newsbert: Distilling Pre-trained Language Model For Intelligent News Application Chuhan Wu et al.
- Larger-scale Transformers For Multilingual Masked Language Modeling Naman Goyal, Jingfei Du, Myle Ott, Giri Anantharaman, Alexis Conneau
- Non-invasive Self-attention For Side Information Fusion In Sequential Recommendation Chang Liu et al.
- Climatebert: A Pretrained Language Model For Climate-related Text Nicolas Webersinke, Mathias Kraus, Julia Anna Bingler, Markus Leippold
- Recent Advances In Natural Language Processing Via Large Pre-trained Language Models: A Survey Bonan Min et al.
- MMBERT: Multimodal BERT Pretraining For Improved Medical VQA Yash Khare et al.
- Understanding And Overcoming The Challenges Of Efficient Transformer Quantization Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort
- Scheduled Sampling In Vision-language Pretraining With Decoupled Encoder-decoder Network Yehao Li, Yingwei Pan, Ting Yao, Jingwen Chen, Tao Mei
- Prompt-learning For Fine-grained Entity Typing Ning Ding et al.
- NSP-BERT: A Prompt-based Few-shot Learner Through An Original Pre-training Task: Next Sentence Prediction Yi Sun, Yu Zheng, Chao Hao, Hangping Qiu
- Maria: Spanish Language Models Asier Gutiérrez-fandiño et al.
- CDLM: Cross-document Language Modeling Avi Caciularu et al.
- Debertav3: Improving Deberta Using Electra-style Pre-training With Gradient-disentangled Embedding Sharing Pengcheng He, Jianfeng Gao, Weizhu Chen
- Multilingual LAMA: Investigating Knowledge In Multilingual Pretrained Language Models Nora Kassner, Philipp Dufter, Hinrich Schütze
- Multilingual Language Models Predict Human Reading Behavior Nora Hollenstein, Federico Pirovano, Ce Zhang, Lena Jäger, Lisa Beinborn
- Muppet: Massive Multi-task Representations With Pre-finetuning Armen Aghajanyan et al.
- Prune Once For All: Sparse Pre-trained Language Models Ofir Zafrir, Ariel Larey, Guy Boudoukh, Haihao Shen, Moshe Wasserblat
- Sustainable Modular Debiasing Of Language Models Anne Lauscher, Tobias Lüken, Goran Glavaš
- Predicting The Performance Of Multilingual NLP Models Anirudh Srinivasan et al.
- What Do Pre-trained Code Models Know About Code? Anjan Karmakar, Romain Robbes
- An Empirical Study Of Training End-to-end Vision-and-language Transformers Zi-yi Dou et al.
- GLM: General Language Model Pretraining With Autoregressive Blank Infilling Zhengxiao Du et al.
- Worst Of Both Worlds: Biases Compound In Pre-trained Vision-and-language Models Tejas Srinivasan, Yonatan Bisk
- Tacl: Improving BERT Pre-training With Token-aware Contrastive Learning Yixuan Su et al.
- On Explaining Your Explanations Of BERT: An Empirical Study With Sequence Classification Zhengxuan Wu, Desmond C. Ong
- Large Pre-trained Language Models Contain Human-like Biases Of What Is Right And Wrong To Do Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin A. Rothkopf, Kristian Kersting
- Bertese: Learning To Speak To BERT Adi Haviv, Jonathan Berant, Amir Globerson
- Commitbert: Commit Message Generation Using Pre-trained Programming Language Model Tae-hwan Jung
- Embodied BERT: A Transformer Model For Embodied, Language-guided Visual Task Completion Alessandro Suglia, Qiaozi Gao, Jesse Thomason, Govind Thattai, Gaurav Sukhatme
- Distilling Large Language Models Into Tiny And Effective Students Using Pqrnn Prabhu Kaliamoorthi, Aditya Siddhant, Edward Li, Melvin Johnson
- CANINE: Pre-training An Efficient Tokenization-free Encoder For Language Representation Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting
- Open Domain Question Answering Over Tables Via Dense Retrieval Jonathan Herzig, Thomas Müller, Syrine Krichene, Julian Martin Eisenschlos
- Scifive: A Text-to-text Transformer Model For Biomedical Literature Long N. Phan et al.
- Exploring Transformers In Natural Language Generation: GPT, BERT, And Xlnet M. Onat Topal, Anil Bas, Imke Van Heerden
- Using Adversarial Attacks To Reveal The Statistical Bias In Machine Reading Comprehension Models Jieyu Lin, Jiajie Zou, Nai Ding
- Multi-modal Understanding And Generation For Medical Images And Text Via Vision-language Pre-training Jong Hak Moon, Hyungyung Lee, Woncheol Shin, Young-hak Kim, Edward Choi
- Rethink Training Of BERT Rerankers In Multi-stage Retrieval Pipeline Luyu Gao, Zhuyun Dai, Jamie Callan
- I-BERT: Integer-only BERT Quantization Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer
- Sentence-t5: Scalable Sentence Encoders From Pre-trained Text-to-text Models Jianmo Ni et al.
- UFO: A Unified Transformer For Vision-language Representation Learning Jianfeng Wang et al.
- Codexglue: A Machine Learning Benchmark Dataset For Code Understanding And Generation Shuai Lu et al.
- Variational Information Bottleneck For Effective Low-resource Fine-tuning Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- Augmenting Sequential Recommendation With Pseudo-prior Items Via Reversely Pre-training Transformer Zhiwei Liu, Ziwei Fan, Yu Wang, Philip S. Yu
- Improving Language Models By Retrieving From Trillions Of Tokens Sebastian Borgeaud et al.
- Robertuito: A Pre-trained Language Model For Social Media Text In Spanish Juan Manuel Pérez, Damián A. Furman, Laura Alonso Alemany, Franco Luque
- Normformer: Improved Transformer Pretraining With Extra Normalization Sam Shleifer, Jason Weston, Myle Ott
- AMMUS: A Survey Of Transformer-based Pretrained Models In Natural Language Processing Katikapalli Subramanyam Kalyan, Ajit Rajasekharan, Sivanesan Sangeetha
- A Comparative Study Of Transformer-based Language Models On Extractive Question Answering Kate Pearce, Tiffany Zhan, Aneesh Komanduri, Justin Zhan
- Retromae: Pre-training Retrieval-oriented Language Models Via Masked Auto-encoder Shitao Xiao, Zheng Liu, Yingxia Shao, Zhao Cao
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- On The Paradox Of Learning To Reason From Data Honghua Zhang, Liunian Harold Li, Tao Meng, Kai-wei Chang, Guy Van Den Broeck
- Contrastive Learning With Bidirectional Transformers For Sequential Recommendation Hanwen Du et al.
- Vl-beit: Generative Vision-language Pretraining Hangbo Bao, Wenhui Wang, Li Dong, Furu Wei
- Layoutlmv3: Pre-training For Document AI With Unified Text And Image Masking Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei
- Zeroquant: Efficient And Affordable Post-training Quantization For Large-scale Transformers Zhewei Yao et al.
- Flashattention: Fast And Memory-efficient Exact Attention With Io-awareness Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré
- Minicons: Enabling Flexible Behavioral And Representational Analyses Of Transformer Language Models Kanishka Misra
- VLC-BERT: Visual Question Answering With Contextualized Commonsense Knowledge Sahithya Ravi, Aditya Chinchure, Leonid Sigal, Renjie Liao, Vered Shwartz
- Accelerating Attention Through Gradient-based Learned Runtime Pruning Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, Hadi Esmaeilzadeh, Mingu Kang
- Cramming: Training A Language Model On A Single GPU In One Day Jonas Geiping, Tom Goldstein
- Biogpt: Generative Pre-trained Transformer For Biomedical Text Generation And Mining Renqian Luo et al.
- Adapting Pre-trained Language Models To African Languages Via Multilingual Adaptive Fine-tuning Jesujoba O. Alabi, David Ifeoluwa Adelani, Marius Mosbach, Dietrich Klakow
- BERTIN: Efficient Pre-training Of A Spanish Language Model Using Perplexity Sampling Javier De La Rosa et al.
- What Language Model Architecture And Pretraining Objective Work Best For Zero-shot Generalization? Thomas Wang et al.
- Memorization Without Overfitting: Analyzing The Training Dynamics Of Large Language Models Kushal Tirumala, Aram H. Markosyan, Luke Zettlemoyer, Armen Aghajanyan
- Promptagator: Few-shot Dense Retrieval From 8 Examples Zhuyun Dai et al.
- Ernie-search: Bridging Cross-encoder With Dual-encoder Via Self On-the-fly Distillation For Dense Passage Retrieval Yuxiang Lu et al.
- The Optimal BERT Surgeon: Scalable And Accurate Second-order Pruning For Large Language Models Eldar Kurtic et al.
- Bytetransformer: A High-performance Transformer Boosted For Variable-length Inputs Yujia Zhai et al.
- A Generative Language Model For Few-shot Aspect-based Sentiment Analysis Ehsan Hosseini-asl, Wenhao Liu, Caiming Xiong
- No More Fine-tuning? An Experimental Evaluation Of Prompt Tuning In Code Intelligence Chaozheng Wang et al.
- LERT: A Linguistically-motivated Pre-trained Language Model Yiming Cui, Wanxiang Che, Shijin Wang, Ting Liu
- Audiolm: A Language Modeling Approach To Audio Generation Zalán Borsos et al.
- Thinking About GPT-3 In-context Learning For Biomedical IE? Think Again Bernal Jiménez Gutiérrez et al.
- What Do They Capture? A Structural Analysis Of Pre-trained Language Models For Source Code Yao Wan et al.
- Reshaping Robot Trajectories Using Natural Language Commands: A Study Of Multi-modal Data Alignment Using Transformers Arthur Bucker et al.
- Zero-shot Video Question Answering Via Frozen Bidirectional Language Models Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- Mslam: Massively Multilingual Joint Pre-training For Speech And Text Ankur Bapna et al.
- Clinical-longformer And Clinical-bigbird: Transformers For Long Clinical Sequences Yikuan Li, Ramsey M. Wehbe, Faraz S. Ahmad, Hanyin Wang, Yuan Luo
- Don't Generate, Discriminate: A Proposal For Grounding Language Models To Real-world Environments Yu Gu, Xiang Deng, Yu Su
- Generating Training Data With Language Models: Towards Zero-shot Language Understanding Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han
- Should You Mask 15% In Masked Language Modeling? Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen
- Empowering Language Models With Knowledge Graph Reasoning For Question Answering Ziniu Hu et al.
- A Systematic Review And Replicability Study Of Bert4rec For Sequential Recommendation Aleksandr Petrov, Craig Macdonald
- Prompt Tuning For Discriminative Pre-trained Language Models Yuan Yao et al.
- Arabart: A Pretrained Arabic Sequence-to-sequence Model For Abstractive Summarization Moussa Kamal Eddine, Nadi Tomeh, Nizar Habash, Joseph Le Roux, Michalis Vazirgiannis
- Dylora: Parameter Efficient Tuning Of Pre-trained Models Using Dynamic Search-free Low-rank Adaptation Mojtaba Valipour, Mehdi Rezagholizadeh, Ivan Kobyzev, Ali Ghodsi
- Rlprompt: Optimizing Discrete Text Prompts With Reinforcement Learning Mingkai Deng et al.
- Deep Bidirectional Language-knowledge Graph Pretraining Michihiro Yasunaga et al.
- CLIPPO: Image-and-language Understanding From Pixels Only Michael Tschannen, Basil Mustafa, Neil Houlsby
- Transformer Quality In Linear Time Weizhe Hua, Zihang Dai, Hanxiao Liu, Quoc V. Le
- Black-box Tuning For Language-model-as-a-service Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu
- CTRAN: Cnn-transformer-based Network For Natural Language Understanding Mehrdad Rafiepour, Javad Salimi Sartakhti
- Chatgpt For Vulnerability Detection, Classification, And Repair: How Far Are We? Michael Fu, Chakkrit Tantithamthavorn, Van Nguyen, Trung Le
- The Political Ideology Of Conversational AI: Converging Evidence On Chatgpt's Pro-environmental, Left-libertarian Orientation Jochen Hartmann, Jasper Schwenzow, Maximilian Witte
- Leveraging Large Language Models For Sequential Recommendation Jesse Harte et al.
- Evaluation Of Chatgpt Family Of Models For Biomedical Reasoning And Classification Shan Chen et al.
- GPT-RE: In-context Learning For Relation Extraction Using Large Language Models Zhen Wan et al.
- Prompting For Multimodal Hateful Meme Classification Rui Cao, Roy Ka-wei Lee, Wen-haw Chong, Jing Jiang
- Retrieval-augmented Image Captioning Rita Ramos, Desmond Elliott, Bruno Martins
- Scaling Laws For Language Encoding Models In Fmri Richard Antonello, Aditya Vaidya, Alexander G. Huth
- Llama-adapter: Efficient Fine-tuning Of Language Models With Zero-init Attention Renrui Zhang et al.
- Masked Vision And Language Pre-training With Unimodal And Multimodal Contrastive Losses For Medical Visual Question Answering Pengfei Li, Gang Liu, Jinlong He, Zixu Zhao, Shenjun Zhong
- Label Supervised Llama Finetuning Zongxi Li et al.
- Chatgpt: Beginning Of An End Of Manual Linguistic Data Annotation? Use Case Of Automatic Genre Identification Taja Kuzman, Igor Mozetič, Nikola Ljubešić
- On The Possibilities Of Ai-generated Text Detection Souradip Chakraborty et al.
- Towards Efficient Fine-tuning Of Pre-trained Code Models: An Experimental Study And Beyond Ensheng Shi et al.
- Fine-tuning Chatgpt For Automatic Scoring Ehsan Latif, Xiaoming Zhai
- Llm-powered Data Augmentation For Enhanced Cross-lingual Performance Chenxi Whitehouse, Monojit Choudhury, Alham Fikri Aji
- Large Language Models In The Workplace: A Case Study On Prompt Engineering For Job Type Classification Benjamin Clavié, Alexandru Ciceu, Frederick Naylor, Guillaume Soulié, Thomas Brightwell
- Bad Actor, Good Advisor: Exploring The Role Of Large Language Models In Fake News Detection Beizhe Hu et al.
- A Comparative Study Of Pretrained Language Models For Long Clinical Text Yikuan Li, Ramsey M. Wehbe, Faraz S. Ahmad, Hanyin Wang, Yuan Luo
- Low-rank Adaptation Of Large Language Model Rescoring For Parameter-efficient Speech Recognition Yu Yu et al.
- Improving Large Language Models For Clinical Named Entity Recognition Via Prompt Engineering Yan Hu et al.
- Is Chatgpt A Good Sentiment Analyzer? A Preliminary Study Zengzhi Wang et al.
- From Text To Transformation: A Comprehensive Review Of Large Language Models' Versatility Pravneet Kaur et al.
- Findings Of The Second Babylm Challenge: Sample-efficient Pretraining On Developmentally Plausible Corpora Michael Y. Hu et al.
- Quality Of Answers Of Generative Large Language Models Vs Peer Patients For Interpreting Lab Test Results For Lay Patients: Evaluation Study Zhe He et al.
- Findings Of The Babylm Challenge: Sample-efficient Pretraining On Developmentally Plausible Corpora Alex Warstadt et al.
🏷 Bias Mitigation
- End-to-end Bias Mitigation By Modelling Biases In Corpora Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- Learning To Deceive With Attention-based Explanations Danish Pruthi, Mansi Gupta, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton
- Societal Biases In Language Generation: Progress And Challenges Emily Sheng, Kai-wei Chang, Premkumar Natarajan, Nanyun Peng
- Improving Gender Fairness Of Pre-trained Language Models Without Catastrophic Forgetting Zahra Fatemi, Chen Xing, Wenhao Liu, Caiming Xiong
- Sustainable Modular Debiasing Of Language Models Anne Lauscher, Tobias Lüken, Goran Glavaš
- Perturbation Augmentation For Fairer NLP Rebecca Qian et al.
- Chatgpt: The End Of Online Exam Integrity? Teo Susnjak
- Quantifying Memorization Across Neural Language Models Nicholas Carlini et al.
- Holistic Evaluation Of Language Models Percy Liang et al.
- Mvbench: A Comprehensive Multi-modal Video Understanding Benchmark Kunchang Li et al.
- Is Chatgpt Fair For Recommendation? Evaluating Fairness In Large Language Model Recommendation Jizhi Zhang et al.
- Bias And Fairness In Chatbots: An Overview Jintang Xue et al.
- Scaling Vision Transformers To 22 Billion Parameters Mostafa Dehghani et al.
- Having Beer After Prayer? Measuring Cultural Bias In Large Language Models Tarek Naous, Michael J. Ryan, Alan Ritter, Wei Xu
- GPT Detectors Are Biased Against Non-native English Writers Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, James Zou
- Can Chatgpt Assess Human Personalities? A General Evaluation Framework Haocong Rao, Cyril Leung, Chunyan Miao
- Opening Up Chatgpt: Tracking Openness, Transparency, And Accountability In Instruction-tuned Text Generators Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse
- Language Model Tokenizers Introduce Unfairness Between Languages Aleksandar Petrov, Emanuele La Malfa, Philip H. S. Torr, Adel Bibi
- "Kelly Is A Warm Person, Joseph Is A Role Model": Gender Biases In Llm-generated Reference Letters Yixin Wan et al.
- Trustworthy Llms: A Survey And Guideline For Evaluating Large Language Models' Alignment Yang Liu et al.
- Mapping The Ethics Of Generative AI: A Comprehensive Scoping Review Thilo Hagendorff
- Trustllm: Trustworthiness In Large Language Models Yue Huang et al.
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
🏷 COLING
🏷 Dataset
- Towards Transfer Learning For End-to-end Speech Synthesis From Deep Pre-trained Language Models Wei Fang, Yu-an Chung, James Glass
- Llm-assisted Content Analysis: Using Large Language Models To Support Deductive Coding Robert Chew, John Bollenbacher, Michael Wenger, Jessica Speer, Annice Kim
- Observations On Llms For Telecom Domain: Capabilities And Limitations Sumit Soman, Ranjani H G
- Improved Baselines With Visual Instruction Tuning Haotian Liu, Chunyuan Li, Yuheng Li, Yong Jae Lee
- A Multitask, Multilingual, Multimodal Evaluation Of Chatgpt On Reasoning, Hallucination, And Interactivity Yejin Bang et al.
🏷 Distillation
- Sequence-level Knowledge Distillation Yoon Kim, Alexander M. Rush
- Data Distillation For Controlling Specificity In Dialogue Generation Jiwei Li, Will Monroe, Dan Jurafsky
- Attention-guided Answer Distillation For Machine Reading Comprehension Minghao Hu et al.
- MKD: A Multi-task Knowledge Distillation Approach For Pretrained Language Models Linqing Liu, Huan Wang, Jimmy Lin, Richard Socher, Caiming Xiong
- Scalable Attentive Sentence-pair Modeling Via Distilled Sentence Embedding Oren Barkan et al.
- Acquiring Knowledge From Pre-trained Model To Neural Machine Translation Rongxiang Weng, Heng Yu, Shujian Huang, Shanbo Cheng, Weihua Luo
- Reducing Transformer Depth On Demand With Structured Dropout Angela Fan, Edouard Grave, Armand Joulin
- Distilbert, A Distilled Version Of BERT: Smaller, Faster, Cheaper And Lighter Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
- Approximating Interactive Human Evaluation With Self-play For Open-domain Dialog Systems Asma Ghandeharioun et al.
- Well-read Students Learn Better: On The Importance Of Pre-training Compact Models Iulia Turc, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Structured Pruning Of A Bert-based Question Answering Model J. S. Mccarley, Rishav Chakravarti, Avirup Sil
- Tinybert: Distilling BERT For Natural Language Understanding Xiaoqi Jiao et al.
- Model Compression With Two-stage Multi-teacher Knowledge Distillation For Web Question Answering System Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang
- Towards Making The Most Of BERT In Neural Machine Translation Jiacheng Yang et al.
- Bert-of-theseus: Compressing BERT By Progressive Module Replacing Canwen Xu, Wangchunshu Zhou, Tao Ge, Furu Wei, Ming Zhou
- Pre-trained Summarization Distillation Sam Shleifer, Alexander M. Rush
- Pretrained Transformers Improve Out-of-distribution Robustness Dan Hendrycks et al.
- Ternarybert: Distillation-aware Ultra-low Bit BERT Wei Zhang et al.
- Knowledge Distillation For Improved Accuracy In Spoken Question Answering Chenyu You, Nuo Chen, Yuexian Zou
- Deformer: Decomposing Pre-trained Transformers For Faster Question Answering Qingqing Cao, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian
- Efficient Transformer-based Large Scale Language Representations Using Hardware-friendly Block Structured Pruning Bingbing Li et al.
- Contrastive Distillation On Intermediate Representations For Language Model Compression Siqi Sun et al.
- Mixkd: Towards Efficient Distillation Of Large-scale Language Models Kevin J Liang et al.
- On The Effect Of Dropping Layers Of Pre-trained Transformer Models Hassan Sajjad, Fahim Dalvi, Nadir Durrani, Preslav Nakov
- Minilm: Deep Self-attention Distillation For Task-agnostic Compression Of Pre-trained Transformers Wenhui Wang et al.
- LRC-BERT: Latent-representation Contrastive Knowledge Distillation For Natural Language Understanding Hao Fu et al.
- Minilmv2: Multi-head Self-attention Relation Distillation For Compressing Pretrained Transformers Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei
- The NLP Cookbook: Modern Recipes For Transformer Based Deep Learning Architectures Sushant Singh, Ausif Mahmood
- ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Shuohuan Wang et al.
- Generate, Annotate, And Learn: NLP With Synthetic Text Xuanli He, Islam Nassar, Jamie Kiros, Gholamreza Haffari, Mohammad Norouzi
- Compressing Visual-linguistic Model Via Knowledge Distillation Zhiyuan Fang et al.
- Greedy-layer Pruning: Speeding Up Transformer Models For Natural Language Processing David Peer, Sebastian Stabinger, Stefan Engl, Antonio Rodriguez-sanchez
- One Teacher Is Enough? Pre-trained Language Model Distillation From Multiple Teachers Chuhan Wu, Fangzhao Wu, Yongfeng Huang
- Newsbert: Distilling Pre-trained Language Model For Intelligent News Application Chuhan Wu et al.
- Symbolic Knowledge Distillation: From General Language Models To Commonsense Models Peter West et al.
- Prune Once For All: Sparse Pre-trained Language Models Ofir Zafrir, Ariel Larey, Guy Boudoukh, Haihao Shen, Moshe Wasserblat
- Distilling Large Language Models Into Tiny And Effective Students Using Pqrnn Prabhu Kaliamoorthi, Aditya Siddhant, Edward Li, Melvin Johnson
- CLIP-TD: CLIP Targeted Distillation For Vision-language Tasks Zhecan Wang et al.
- What Do Llms Know About Financial Markets? A Case Study On Reddit Market Sentiment Analysis Xiang Deng, Vasilisa Bashlovkina, Feng Han, Simon Baumgartner, Michael Bendersky
- Enabling Multimodal Generation On CLIP Via Vision-language Knowledge Distillation Wenliang Dai et al.
- Zeroquant: Efficient And Affordable Post-training Quantization For Large-scale Transformers Zhewei Yao et al.
- Zerogen: Efficient Zero-shot Learning Via Dataset Generation Jiacheng Ye et al.
- Teaching Small Language Models To Reason Lucie Charlotte Magister, Jonathan Mallinson, Jakub Adamek, Eric Malmi, Aliaksei Severyn
- Distilling Reasoning Capabilities Into Smaller Language Models Kumar Shridhar, Alessandro Stolfo, Mrinmaya Sachan
- Structured Pruning Learns Compact And Accurate Models Mengzhou Xia, Zexuan Zhong, Danqi Chen
- Ernie-search: Bridging Cross-encoder With Dual-encoder Via Self On-the-fly Distillation For Dense Passage Retrieval Yuxiang Lu et al.
- Sgva-clip: Semantic-guided Visual Adapting Of Vision-language Models For Few-shot Image Classification Fang Peng, Xiaoshan Yang, Linhui Xiao, Yaowei Wang, Changsheng Xu
- The Optimal BERT Surgeon: Scalable And Accurate Second-order Pruning For Large Language Models Eldar Kurtic et al.
- Re2g: Retrieve, Rerank, Generate Michael Glass et al.
- Zephyr: Direct Distillation Of LM Alignment Lewis Tunstall et al.
- Tinyclip: CLIP Distillation Via Affinity Mimicking And Weight Inheritance Kan Stephen Wu et al.
- Learning To Compress Prompts With Gist Tokens Jesse Mu, Xiang Lisa Li, Noah Goodman
- Sur-adapter: Enhancing Text-to-image Pre-trained Diffusion Models With Large Language Models Shanshan Zhong, Zhongzhan Huang, Wushao Wen, Jinghui Qin, Liang Lin
- Can A Student Large Language Model Perform As Well As Its Teacher? Sia Gholami, Marwan Omar
- Is Chatgpt Good At Search? Investigating Large Language Models As Re-ranking Agents Weiwei Sun et al.
- Distilled GPT For Source Code Summarization Chia-yi Su, Collin Mcmillan
- Distilling Step-by-step! Outperforming Larger Language Models With Less Training Data And Smaller Model Sizes Cheng-yu Hsieh et al.
- Bad Actor, Good Advisor: Exploring The Role Of Large Language Models In Fake News Detection Beizhe Hu et al.
- A Survey On Model Compression For Large Language Models Xunyu Zhu, Jian Li, Yong Liu, Can Ma, Weiping Wang
- Llm-pruner: On The Structural Pruning Of Large Language Models Xinyin Ma, Gongfan Fang, Xinchao Wang
- Universalner: Targeted Distillation From Large Language Models For Open Named Entity Recognition Wenxuan Zhou, Sheng Zhang, Yu Gu, Muhao Chen, Hoifung Poon
- Minillm: Knowledge Distillation Of Large Language Models Yuxian Gu, Li Dong, Furu Wei, Minlie Huang
- Preventing Zero-shot Transfer Degradation In Continual Learning Of Vision-language Models Zangwei Zheng et al.
- Gemma 2: Improving Open Language Models At A Practical Size Gemma Team et al.
- Optimization Methods For Personalizing Large Language Models Through Retrieval Augmentation Alireza Salemi, Surya Kallumadi, Hamed Zamani
🏷 Efficiency and Optimization
- Sequence-level Knowledge Distillation Yoon Kim, Alexander M. Rush
- Sequence-to-sequence Learning As Beam-search Optimization Sam Wiseman, Alexander M. Rush
- Sample-efficient Actor-critic Reinforcement Learning With Supervised Data For Dialogue Management Pei-hao Su, Pawel Budzianowski, Stefan Ultes, Milica Gasic, Steve Young
- Data Distillation For Controlling Specificity In Dialogue Generation Jiwei Li, Will Monroe, Dan Jurafsky
- End-to-end Optimization Of Goal-driven And Visually Grounded Dialogue Systems Florian Strub et al.
- Efficient Contextualized Representation: Language Model Pruning For Sequence Labeling Liyuan Liu, Xiang Ren, Jingbo Shang, Jian Peng, Jiawei Han
- Babyai: A Platform To Study The Sample Efficiency Of Grounded Language Learning Maxime Chevalier-boisvert et al.
- Attention-guided Answer Distillation For Machine Reading Comprehension Minghao Hu et al.
- Sample Efficient Text Summarization Using A Single Pre-trained Transformer Urvashi Khandelwal, Kevin Clark, Dan Jurafsky, Lukasz Kaiser
- MKD: A Multi-task Knowledge Distillation Approach For Pretrained Language Models Linqing Liu, Huan Wang, Jimmy Lin, Richard Socher, Caiming Xiong
- Cross-lingual Language Model Pretraining Guillaume Lample, Alexis Conneau
- Scalable Attentive Sentence-pair Modeling Via Distilled Sentence Embedding Oren Barkan et al.
- Acquiring Knowledge From Pre-trained Model To Neural Machine Translation Rongxiang Weng, Heng Yu, Shujian Huang, Shanbo Cheng, Weihua Luo
- Fully Quantized Transformer For Machine Translation Gabriele Prato, Ella Charlaix, Mehdi Rezagholizadeh
- Convert: Efficient And Accurate Conversational Representations From Transformers Matthew Henderson et al.
- Reducing Transformer Depth On Demand With Structured Dropout Angela Fan, Edouard Grave, Armand Joulin
- Mixture Content Selection For Diverse Sequence Generation Jaemin Cho, Minjoon Seo, Hannaneh Hajishirzi
- Q8BERT: Quantized 8bit BERT Ofir Zafrir, Guy Boudoukh, Peter Izsak, Moshe Wasserblat
- Distilbert, A Distilled Version Of BERT: Smaller, Faster, Cheaper And Lighter Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
- Approximating Interactive Human Evaluation With Self-play For Open-domain Dialog Systems Asma Ghandeharioun et al.
- Well-read Students Learn Better: On The Importance Of Pre-training Compact Models Iulia Turc, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Reweighted Proximal Pruning For Large-scale Language Representation Fu-ming Guo, Sijia Liu, Finlay S. Mungall, Xue Lin, Yanzhi Wang
- Are Sixteen Heads Really Better Than One? Paul Michel, Omer Levy, Graham Neubig
- Structured Pruning Of A Bert-based Question Answering Model J. S. Mccarley, Rishav Chakravarti, Avirup Sil
- Tinybert: Distilling BERT For Natural Language Understanding Xiaoqi Jiao et al.
- Single Headed Attention RNN: Stop Thinking With Your Head Stephen Merity
- ZeRO: Memory Optimizations Toward Training Trillion Parameter Models Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, Yuxiong He
- Analyzing Multi-head Self-attention: Specialized Heads Do The Heavy Lifting, The Rest Can Be Pruned Elena Voita, David Talbot, Fedor Moiseev, Rico Sennrich, Ivan Titov
- Levenshtein Transformer Jiatao Gu, Changhan Wang, Jake Zhao
- Visualizing And Understanding The Effectiveness Of BERT Yaru Hao, Li Dong, Furu Wei, Ke Xu
- Structured Pruning Of Large Language Models Ziheng Wang, Jeremy Wohlwend, Tao Lei
- Model Compression With Two-stage Multi-teacher Knowledge Distillation For Web Question Answering System Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang
- Real-time Open-domain Question Answering With Dense-sparse Phrase Index Minjoon Seo et al.
- Synchronous Bidirectional Inference For Neural Sequence Generation Jiajun Zhang, Long Zhou, Yang Zhao, Chengqing Zong
- Megatron-lm: Training Multi-billion Parameter Language Models Using Model Parallelism Mohammad Shoeybi et al.
- Learning To Few-shot Learn Across Diverse Natural Language Classification Tasks Trapit Bansal, Rishikesh Jha, Andrew Mccallum
- Towards Making The Most Of BERT In Neural Machine Translation Jiacheng Yang et al.
- A Modular Task-oriented Dialogue System Using A Neural Mixture-of-experts Jiahuan Pei, Pengjie Ren, Maarten De Rijke
- SPARTA: Efficient Open-domain Question Answering Via Sparse Transformer Matching Retrieval Tiancheng Zhao, Xiaopeng Lu, Kyusong Lee
- Bert-of-theseus: Compressing BERT By Progressive Module Replacing Canwen Xu, Wangchunshu Zhou, Tao Ge, Furu Wei, Ming Zhou
- Train Large, Then Compress: Rethinking Model Size For Efficient Training And Inference Of Transformers Zhuohan Li et al.
- Pre-trained Summarization Distillation Sam Shleifer, Alexander M. Rush
- Progressive Generation Of Long Text With Pretrained Language Models Bowen Tan, Zichao Yang, Maruan Al-Shedivat, Eric P. Xing, Zhiting Hu
- When BERT Plays The Lottery, All Tickets Are Winning Sai Prasanna, Anna Rogers, Anna Rumshisky
- Pretrained Transformers Improve Out-of-distribution Robustness Dan Hendrycks et al.
- Lite Transformer With Long-short Range Attention Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han
- Ternarybert: Distillation-aware Ultra-low Bit BERT Wei Zhang et al.
- Coda: Contrast-enhanced And Diversity-promoting Data Augmentation For Natural Language Understanding Yanru Qu et al.
- Knowledge Distillation For Improved Accuracy In Spoken Question Answering Chenyu You, Nuo Chen, Yuexian Zou
- Few-shot Text Generation With Pattern-exploiting Training Timo Schick, Hinrich Schütze
- It's Not Just Size That Matters: Small Language Models Are Also Few-shot Learners Timo Schick, Hinrich Schütze
- Alfworld: Aligning Text And Embodied Environments For Interactive Learning Mohit Shridhar et al.
- On The Stability Of Fine-tuning BERT: Misconceptions, Explanations, And Strong Baselines Marius Mosbach, Maksym Andriushchenko, Dietrich Klakow
- Scaling Laws For Neural Language Models Jared Kaplan et al.
- Deformer: Decomposing Pre-trained Transformers For Faster Question Answering Qingqing Cao, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian
- Efficient Transformer-based Large Scale Language Representations Using Hardware-friendly Block Structured Pruning Bingbing Li et al.
- Contrastive Distillation On Intermediate Representations For Language Model Compression Siqi Sun et al.
- Adapterdrop: On The Efficiency Of Adapters In Transformers Andreas Rücklé et al.
- Edgebert: Sentence-level Energy Optimizations For Latency-aware Multi-task NLP Inference Thierry Tambe et al.
- GOBO: Quantizing Attention-based NLP Models For Low Latency And Energy Efficient Inference Ali Hadi Zadeh, Isak Edo, Omar Mohamed Awad, Andreas Moshovos
- Intellicode Compose: Code Generation Using Transformer Alexey Svyatkovskiy, Shao Kun Deng, Shengyu Fu, Neel Sundaresan
- Mixkd: Towards Efficient Distillation Of Large-scale Language Models Kevin J Liang et al.
- UBAR: Towards Fully End-to-end Task-oriented Dialog Systems With GPT-2 Yunyi Yang, Yunhao Li, Xiaojun Quan
- Mintl: Minimalist Transfer Learning For Task-oriented Dialogue Systems Zhaojiang Lin, Andrea Madotto, Genta Indra Winata, Pascale Fung
- Text Generation By Learning From Demonstrations Richard Yuanzhe Pang, He He
- Trading Off Diversity And Quality In Natural Language Generation Hugh Zhang, Daniel Duckworth, Daphne Ippolito, Arvind Neelakantan
- Template Guided Text Generation For Task-oriented Dialogue Mihir Kale, Abhinav Rastogi
- Variational Transformers For Diverse Response Generation Zhaojiang Lin, Genta Indra Winata, Peng Xu, Zihan Liu, Pascale Fung
- Accelerating Training Of Transformer-based Language Models With Progressive Layer Dropping Minjia Zhang, Yuxiong He
- Length-adaptive Transformer: Train Once With Length Drop, Use Anytime With Search Gyuwan Kim, Kyunghyun Cho
- Multilingual Speech Translation With Efficient Finetuning Of Pretrained Models Xian Li et al.
- On The Effect Of Dropping Layers Of Pre-trained Transformer Models Hassan Sajjad, Fahim Dalvi, Nadir Durrani, Preslav Nakov
- Lightseq: A High Performance Inference Library For Transformers Xiaohui Wang, Ying Xiong, Yang Wei, Mingxuan Wang, Lei Li
- Minilm: Deep Self-attention Distillation For Task-agnostic Compression Of Pre-trained Transformers Wenhui Wang et al.
- BERT Loses Patience: Fast And Robust Inference With Early Exit Wangchunshu Zhou et al.
- Funnel-transformer: Filtering Out Sequential Redundancy For Efficient Language Processing Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le
- Rethinking Embedding Coupling In Pre-trained Language Models Hyung Won Chung, Thibault Févry, Henry Tsai, Melvin Johnson, Sebastian Ruder
- LRC-BERT: Latent-representation Contrastive Knowledge Distillation For Natural Language Understanding Hao Fu et al.
- Minilmv2: Multi-head Self-attention Relation Distillation For Compressing Pretrained Transformers Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei
- On Transferability Of Prompt Tuning For Natural Language Processing Yusheng Su et al.
- The NLP Cookbook: Modern Recipes For Transformer Based Deep Learning Architectures Sushant Singh, Ausif Mahmood
- Ernie-vilg: Unified Generative Pre-training For Bidirectional Vision-language Generation Han Zhang et al.
- E2E-VLP: End-to-end Vision-language Pre-training Enhanced By Visual Learning Haiyang Xu et al.
- Personalized Transformer For Explainable Recommendation Lei Li, Yongfeng Zhang, Li Chen
- FILIP: Fine-grained Interactive Language-image Pre-training Lewei Yao et al.
- ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Shuohuan Wang et al.
- Training Large-scale News Recommenders With Pretrained Language Models In The Loop Shitao Xiao, Zheng Liu, Yingxia Shao, Tao Di, Xing Xie
- Pretrained Transformers As Universal Computation Engines Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch
- Generate, Annotate, And Learn: NLP With Synthetic Text Xuanli He, Islam Nassar, Jamie Kiros, Gholamreza Haffari, Mohammad Norouzi
- Compressing Visual-linguistic Model Via Knowledge Distillation Zhiyuan Fang et al.
- Greedy-layer Pruning: Speeding Up Transformer Models For Natural Language Processing David Peer, Sebastian Stabinger, Stefan Engl, Antonio Rodriguez-sanchez
- Luna: Linear Unified Nested Attention Xuezhe Ma et al.
- Ext5: Towards Extreme Multi-task Scaling For Transfer Learning Vamsi Aribandi et al.
- The Impact Of Multiple Parallel Phrase Suggestions On Email Input And Composition Behaviour Of Native And Non-native English Writers Daniel Buschek, Martin Zürn, Malin Eiband
- The Stability-efficiency Dilemma: Investigating Sequence Length Warmup For Training GPT Models Conglong Li, Minjia Zhang, Yuxiong He
- MAGMA – Multimodal Augmentation Of Generative Models Through Adapter-based Finetuning Constantin Eichenberg, Sidney Black, Samuel Weinbach, Letitia Parcalabescu, Anette Frank
- One Teacher Is Enough? Pre-trained Language Model Distillation From Multiple Teachers Chuhan Wu, Fangzhao Wu, Yongfeng Huang
- Newsbert: Distilling Pre-trained Language Model For Intelligent News Application Chuhan Wu et al.
- Symbolic Knowledge Distillation: From General Language Models To Commonsense Models Peter West et al.
- What Changes Can Large-scale Language Models Bring? Intensive Study On Hyperclova: Billions-scale Korean Generative Pretrained Transformers Boseop Kim et al.
- UNICORN On RAINBOW: A Universal Commonsense Reasoning Model On A New Multitask Benchmark Nicholas Lourie, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi
- Understanding And Overcoming The Challenges Of Efficient Transformer Quantization Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort
- Openprompt: An Open-source Framework For Prompt-learning Ning Ding et al.
- Prompt-learning For Fine-grained Entity Typing Ning Ding et al.
- Debertav3: Improving Deberta Using Electra-style Pre-training With Gradient-disentangled Embedding Sharing Pengcheng He, Jianfeng Gao, Weizhu Chen
- HTLM: Hyper-text Pre-training And Prompting Of Language Models Armen Aghajanyan et al.
- Muppet: Massive Multi-task Representations With Pre-finetuning Armen Aghajanyan et al.
- Vl-adapter: Parameter-efficient Transfer Learning For Vision-and-language Tasks Yi-lin Sung, Jaemin Cho, Mohit Bansal
- Prune Once For All: Sparse Pre-trained Language Models Ofir Zafrir, Ariel Larey, Guy Boudoukh, Haihao Shen, Moshe Wasserblat
- What Do Pre-trained Code Models Know About Code? Anjan Karmakar, Romain Robbes
- When Attention Meets Fast Recurrence: Training Language Models With Reduced Compute Tao Lei
- Demix Layers: Disentangling Domains For Modular Language Modeling Suchin Gururangan, Mike Lewis, Ari Holtzman, Noah A. Smith, Luke Zettlemoyer
- COCO-LM: Correcting And Contrasting Text Sequences For Language Model Pretraining Yu Meng et al.
- A General Language Assistant As A Laboratory For Alignment Amanda Askell et al.
- CPM-2: Large-scale Cost-effective Pre-trained Language Models Zhengyan Zhang et al.
- Clip-adapter: Better Vision-language Models With Feature Adapters Peng Gao et al.
- Scalable And Efficient Moe Training For Multitask Multilingual Models Young Jin Kim et al.
- Distilling Large Language Models Into Tiny And Effective Students Using Pqrnn Prabhu Kaliamoorthi, Aditya Siddhant, Edward Li, Melvin Johnson
- Learned Token Pruning For Transformers Sehoon Kim et al.
- I-BERT: Integer-only BERT Quantization Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer
- UFO: A Unified Transformer For Vision-language Representation Learning Jianfeng Wang et al.
- FLAT: An Optimized Dataflow For Mitigating Attention Bottlenecks Sheng-chun Kao, Suvinay Subramanian, Gaurav Agrawal, Amir Yazdanbakhsh, Tushar Krishna
- Compacter: Efficient Low-rank Hypercomplex Adapter Layers Rabeeh Karimi Mahabadi, James Henderson, Sebastian Ruder
- GALAXY: A Generative Pre-trained Model For Task-oriented Dialog With Semi-supervised Learning And Explicit Policy Injection Wanwei He et al.
- Learning To Prompt For Vision-language Models Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu
- Visualgpt: Data-efficient Adaptation Of Pretrained Language Models For Image Captioning Jun Chen, Han Guo, Kai Yi, Boyang Li, Mohamed Elhoseiny
- MATE: Multi-view Attention For Table Transformer Efficiency Julian Martin Eisenschlos, Maharshi Gor, Thomas Müller, William W. Cohen
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- Revisiting Neural Scaling Laws In Language And Vision Ibrahim Alabdulmohsin, Behnam Neyshabur, Xiaohua Zhai
- Interactive Code Generation Via Test-driven User-intent Formalization Shuvendu K. Lahiri et al.
- CLIP-TD: CLIP Targeted Distillation For Vision-language Tasks Zhecan Wang et al.
- What Do Llms Know About Financial Markets? A Case Study On Reddit Market Sentiment Analysis Xiang Deng, Vasilisa Bashlovkina, Feng Han, Simon Baumgartner, Michael Bendersky
- Enabling Multimodal Generation On CLIP Via Vision-language Knowledge Distillation Wenliang Dai et al.
- An Efficient Memory-augmented Transformer For Knowledge-intensive NLP Tasks Yuxiang Wu et al.
- Smoothquant: Accurate And Efficient Post-training Quantization For Large Language Models Guangxuan Xiao et al.
- LUT-GEMM: Quantized Matrix Multiplication Based On Luts For Efficient Inference In Large-scale Generative Language Models Gunho Park et al.
- Contrastive Decoding: Open-ended Text Generation As Optimization Xiang Lisa Li et al.
- Zeroquant: Efficient And Affordable Post-training Quantization For Large-scale Transformers Zhewei Yao et al.
- Speechprompt: An Exploration Of Prompt Tuning On Generative Spoken Language Model For Speech Processing Tasks Kai-wei Chang, Wei-cheng Tseng, Shang-wen Li, Hung-yi Lee
- Accelerating Attention Through Gradient-based Learned Runtime Pruning Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, Hadi Esmaeilzadeh, Mingu Kang
- Deepspeed-moe: Advancing Mixture-of-experts Inference And Training To Power Next-generation AI Scale Samyam Rajbhandari et al.
- Cramming: Training A Language Model On A Single GPU In One Day Jonas Geiping, Tom Goldstein
- Zerogen: Efficient Zero-shot Learning Via Dataset Generation Jiacheng Ye et al.
- Revisiting The "video" In Video-language Understanding Shyamal Buch et al.
- SPACE-3: Unified Dialog Model Pre-training For Task-oriented Dialog Understanding And Generation Wanwei He et al.
- Emergent Abilities Of Large Language Models Jason Wei et al.
- Teaching Small Language Models To Reason Lucie Charlotte Magister, Jonathan Mallinson, Jakub Adamek, Eric Malmi, Aliaksei Severyn
- Distilling Reasoning Capabilities Into Smaller Language Models Kumar Shridhar, Alessandro Stolfo, Mrinmaya Sachan
- Is Reinforcement Learning (not) For Natural Language Processing: Benchmarks, Baselines, And Building Blocks For Natural Language Policy Optimization Rajkumar Ramamurthy et al.
- Reproducible Scaling Laws For Contrastive Language-image Learning Mehdi Cherti et al.
- Structured Pruning Learns Compact And Accurate Models Mengzhou Xia, Zexuan Zhong, Danqi Chen
- Gpt-3-driven Pedagogical Agents For Training Children's Curious Question-asking Skills Rania Abdelghani et al.
- Mixgen: A New Multi-modal Data Augmentation Xiaoshuai Hao et al.
- Ernie-search: Bridging Cross-encoder With Dual-encoder Via Self On-the-fly Distillation For Dense Passage Retrieval Yuxiang Lu et al.
- Sgva-clip: Semantic-guided Visual Adapting Of Vision-language Models For Few-shot Image Classification Fang Peng, Xiaoshan Yang, Linhui Xiao, Yaowei Wang, Changsheng Xu
- GPTQ: Accurate Post-training Quantization For Generative Pre-trained Transformers Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh
- The Optimal BERT Surgeon: Scalable And Accurate Second-order Pruning For Large Language Models Eldar Kurtic et al.
- Bytetransformer: A High-performance Transformer Boosted For Variable-length Inputs Yujia Zhai et al.
- Hyperprompt: Prompt-based Task-conditioning Of Transformers Yun He et al.
- VIMA: General Robot Manipulation With Multimodal Prompts Yunfan Jiang et al.
- Visual-language Navigation Pretraining Via Prompt-based Environmental Self-exploration Xiwen Liang, Fengda Zhu, Lingling Li, Hang Xu, Xiaodan Liang
- Protoclip: Prototypical Contrastive Language Image Pretraining Delong Chen et al.
- Scaling Laws And Interpretability Of Learning From Repeated Data Danny Hernandez et al.
- Hungry Hungry Hippos: Towards Language Modeling With State Space Models Daniel Y. Fu et al.
- Prompt For Extraction? PAIE: Prompting Argument Interaction For Event Argument Extraction Yubo Ma et al.
- Mplug: Effective And Efficient Vision-language Learning By Cross-modal Skip-connections Chenliang Li et al.
- Llm-planner: Few-shot Grounded Planning For Embodied Agents With Large Language Models Chan Hee Song et al.
- Exploring The Limits Of Domain-adaptive Training For Detoxifying Large-scale Language Models Boxin Wang et al.
- Multimodal Knowledge Alignment With Reinforcement Learning Youngjae Yu et al.
- GLM-130B: An Open Bilingual Pre-trained Model Aohan Zeng et al.
- Qaner: Prompting Question Answering Models For Few-shot Named Entity Recognition Andy T. Liu et al.
- Language Model Compression With Weighted Low-rank Factorization Yen-chang Hsu et al.
- Dual Modality Prompt Tuning For Vision-language Pre-trained Model Yinghui Xing et al.
- Should You Mask 15% In Masked Language Modeling? Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen
- LASP: Text-to-text Optimization For Language-aware Soft Prompting Of Vision & Language Models Adrian Bulat, Georgios Tzimiropoulos
- PEVL: Position-enhanced Pre-training And Prompt Tuning For Vision-language Models Yuan Yao et al.
- Delta Tuning: A Comprehensive Study Of Parameter Efficient Methods For Pre-trained Language Models Ning Ding et al.
- Contrastive Learning Reduces Hallucination In Conversations Weiwei Sun et al.
- Transformer Feed-forward Layers Build Predictions By Promoting Concepts In The Vocabulary Space Mor Geva, Avi Caciularu, Kevin Ro Wang, Yoav Goldberg
- Efficient Training Of Language Models To Fill In The Middle Mohammad Bavarian et al.
- LLM.int8(): 8-bit Matrix Multiplication For Transformers At Scale Tim Dettmers, Mike Lewis, Younes Belkada, Luke Zettlemoyer
- Rlprompt: Optimizing Discrete Text Prompts With Reinforcement Learning Mingkai Deng et al.
- GPT Takes The Bar Exam Michael Ii Bommarito, Daniel Martin Katz
- Re2g: Retrieve, Rerank, Generate Michael Glass et al.
- Black-box Tuning For Language-model-as-a-service Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu
- Large Language Models Are Zero-shot Reasoners Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa
- M6-rec: Generative Pretrained Language Models Are Open-ended Recommender Systems Zeyu Cui, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang
- Holistic Evaluation Of Language Models Percy Liang et al.
- Llm-grounded Diffusion: Enhancing Prompt Understanding Of Text-to-image Diffusion Models With Large Language Models Long Lian, Boyi Li, Adam Yala, Trevor Darrell
- Human-ai Collaboration In Thematic Analysis Using Chatgpt: A User Study And Design Recommendations Lixiang Yan et al.
- Parameter-efficient Fine-tuning Methods For Pretrained Language Models: A Critical Review And Assessment Lingling Xu, Haoran Xie, Si-zhao Joe Qin, Xiaohui Tao, Fu Lee Wang
- Leveraging Pre-trained Large Language Models To Construct And Utilize World Models For Model-based Task Planning Lin Guan, Karthik Valmeekam, Sarath Sreedharan, Subbarao Kambhampati
- Zephyr: Direct Distillation Of LM Alignment Lewis Tunstall et al.
- Automatic Prompt Augmentation And Selection With Chain-of-thought From Labeled Data Kashun Shum, Shizhe Diao, Tong Zhang
- Biomedical Knowledge Graph-optimized Prompt Generation For Large Language Models Karthik Soman et al.
- Tinyclip: CLIP Distillation Via Affinity Mimicking And Weight Inheritance Kan Stephen Wu et al.
- Chipgpt: How Far Are We From Natural Language Hardware Design Kaiyan Chang et al.
- Speechprompt V2: Prompt Tuning For Speech Classification Tasks Kai-wei Chang et al.
- Full Parameter Fine-tuning For Large Language Models With Limited Resources Kai Lv et al.
- ALIP: Adaptive Language-image Pre-training With Synthetic Caption Kaicheng Yang et al.
- Honeybee: Locality-enhanced Projector For Multimodal LLM Junbum Cha, Wooyoung Kang, Jonghwan Mun, Byungseok Roh
- Minigpt-v2: Large Language Model As A Unified Interface For Vision-language Multi-task Learning Jun Chen et al.
- Longnet: Scaling Transformers To 1,000,000,000 Tokens Jiayu Ding et al.
- Language Models Meet World Models: Embodied Experiences Enhance Language Models Jiannan Xiang et al.
- How Can Recommender Systems Benefit From Large Language Models: A Survey Jianghao Lin et al.
- GPT-3.5, GPT-4, Or BARD? Evaluating Llms Reasoning Ability In Zero-shot Setting And Performance Boosting Through Prompts Jessica López Espejel, El Hassane Ettifouri, Mahaman Sanoussi Yahaya Alassan, El Mehdi Chouham, Walid Dahhane
- AWQ: Activation-aware Weight Quantization For LLM Compression And Acceleration Ji Lin et al.
- Learning To Compress Prompts With Gist Tokens Jesse Mu, Xiang Lisa Li, Noah Goodman
- Memory-efficient Fine-tuning Of Compressed Large Language Models Via Sub-4-bit Integer Quantization Jeonghoon Kim et al.
- A Comprehensive Overview Of Large Language Models Humza Naveed et al.
- Theory Of Mind For Multi-agent Collaboration Via Large Language Models Huao Li et al.
- Extending Context Window Of Large Language Models Via Positional Interpolation Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian
- VIP5: Towards Multimodal Foundation Models For Recommendation Shijie Geng, Juntao Tan, Shuchang Liu, Zuohui Fu, Yongfeng Zhang
- Sur-adapter: Enhancing Text-to-image Pre-trained Diffusion Models With Large Language Models Shanshan Zhong, Zhongzhan Huang, Wushao Wen, Jinghui Qin, Liang Lin
- Gpt4tools: Teaching Large Language Model To Use Tools Via Self-instruction Rui Yang et al.
- Secrets Of RLHF In Large Language Models Part I: PPO Rui Zheng et al.
- Palm 2 Technical Report Rohan Anil et al.
- Scaling Laws For Language Encoding Models In Fmri Richard Antonello, Aditya Vaidya, Alexander G. Huth
- Automatic Prompt Optimization With "gradient Descent" And Beam Search Reid Pryzant et al.
- Direct Preference Optimization: Your Language Model Is Secretly A Reward Model Rafael Rafailov et al.
- Codegeex: A Pre-trained Model For Code Generation With Multilingual Benchmarking On Humaneval-x Qinkai Zheng et al.
- Pushing Large Language Models To The 6G Edge: Vision, Challenges, And Opportunities Zheng Lin et al.
- Pre-train, Prompt And Recommendation: A Comprehensive Survey Of Language Modelling Paradigm Adaptations In Recommender Systems Peng Liu, Lemei Zhang, Jon Atle Gulla
- GPT-4 Technical Report Openai et al.
- Sparse Low-rank Adaptation Of Pre-trained Language Models Ning Ding et al.
- Are Aligned Neural Networks Adversarially Aligned? Nicholas Carlini et al.
- A Simple And Effective Pruning Approach For Large Language Models Mingjie Sun, Zhuang Liu, Anna Bair, J. Zico Kolter
- Large Language Models Are Effective Text Rankers With Pairwise Ranking Prompting Zhen Qin et al.
- Empirical Study Of Zero-shot NER With Chatgpt Tingyu Xie et al.
- Qlora: Efficient Finetuning Of Quantized Llms Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, Luke Zettlemoyer
- Spqr: A Sparse-quantized Representation For Near-lossless LLM Weight Compression Tim Dettmers et al.
- RLHF-V: Towards Trustworthy Mllms Via Behavior Alignment From Fine-grained Correctional Human Feedback Tianyu Yu et al.
- Grounding Large Language Models In Interactive Environments With Online Reinforcement Learning Thomas Carta et al.
- Emergent And Predictable Memorization In Large Language Models Stella Biderman et al.
- Can A Student Large Language Model Perform As Well As Its Teacher? Sia Gholami, Marwan Omar
- Do Generative Large Language Models Need Billions Of Parameters? Sia Gholami, Marwan Omar
- Automl-gpt: Automatic Machine Learning With GPT Shujian Zhang, Chengyue Gong, Lemeng Wu, Xingchao Liu, Mingyuan Zhou
- Do Llms Understand User Preferences? Evaluating Llms On User Rating Prediction Wang-cheng Kang et al.
- Scaling Down To Scale Up: A Guide To Parameter-efficient Fine-tuning Vladislav Lialin, Vijeta Deshpande, Xiaowei Yao, Anna Rumshisky
- Automated Reading Passage Generation With Openai's Large Language Model Ummugul Bezirhan, Matthias Von Davier
- Flashattention-2: Faster Attention With Better Parallelism And Work Partitioning Tri Dao
- Llmrec: Large Language Models With Graph Augmentation For Recommendation Wei Wei et al.
- Is Chatgpt Good At Search? Investigating Large Language Models As Re-ranking Agents Weiwei Sun et al.
- R2gengpt: Radiology Report Generation With Frozen Llms Zhanyu Wang, Lingqiao Liu, Lei Wang, Luping Zhou
- Autodroid: Llm-powered Task Automation In Android Hao Wen et al.
- Chain Of Hindsight Aligns Language Models With Feedback Hao Liu, Carmelo Sferrazza, Pieter Abbeel
- Languagempc: Large Language Models As Decision Makers For Autonomous Driving Hao Sha et al.
- Visual-language Prompt Tuning With Knowledge-guided Context Optimization Hantao Yao, Rui Zhang, Changsheng Xu
- GPT-4 Enhanced Multimodal Grounding For Autonomous Driving: Leveraging Cross-modal Attention With Large Language Models Haicheng Liao et al.
- Wizardmath: Empowering Mathematical Reasoning For Large Language Models Via Reinforced Evol-instruct Haipeng Luo et al.
- Language Models Can Solve Computer Tasks Geunwoo Kim, Pierre Baldi, Stephen Mcaleer
- Cheap And Quick: Efficient Vision-language Instruction Tuning For Large Language Models Gen Luo et al.
- Preference Ranking Optimization For Human Alignment Feifan Song et al.
- Chatgpt Outperforms Crowd-workers For Text-annotation Tasks Fabrizio Gilardi, Meysam Alizadeh, Maël Kubli
- Codegen2: Lessons For Training Llms On Programming And Natural Languages Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou
- Do We Still Need Clinical Language Models? Eric Lehman et al.
- Sparsegpt: Massive Language Models Can Be Accurately Pruned In One-shot Elias Frantar, Dan Alistarh
- Read-only Prompt Optimization For Vision-language Few-shot Learning Dongjun Lee et al.
- MELTR: Meta Loss Transformer For Learning To Fine-tune Video Foundation Models Dohwan Ko et al.
- Text-to-sql Empowered By Large Language Models: A Benchmark Evaluation Dawei Gao et al.
- 3d-vista: Pre-trained Transformer For 3D Vision And Text Alignment Ziyu Zhu et al.
- Whitefox: White-box Compiler Fuzzing Empowered By Large Language Models Chenyuan Yang et al.
- Chateval: Towards Better Llm-based Evaluators Through Multi-agent Debate Chi-min Chan et al.
- Distilled GPT For Source Code Summarization Chia-yi Su, Collin Mcmillan
- Distilling Step-by-step! Outperforming Larger Language Models With Less Training Data And Smaller Model Sizes Cheng-yu Hsieh et al.
- An Iterative Optimizing Framework For Radiology Report Summarization With Chatgpt Chong Ma et al.
- Dipping Plms Sauce: Bridging Structure And Text For Effective Knowledge Graph Completion Via Conditional Soft Prompting Chen Chen, Yufei Wang, Aixin Sun, Bing Li, Kwok-yan Lam
- MME: A Comprehensive Evaluation Benchmark For Multimodal Large Language Models Chaoyou Fu et al.
- Adapting Large Language Models By Integrating Collaborative Semantics For Recommendation Bowen Zheng et al.
- RWKV: Reinventing Rnns For The Transformer Era Bo Peng et al.
- Motiongpt: Human Motion As A Foreign Language Biao Jiang et al.
- Bad Actor, Good Advisor: Exploring The Role Of Large Language Models In Fake News Detection Beizhe Hu et al.
- Detecting And Preventing Hallucinations In Large Vision Language Models Anisha Gunjal, Jihan Yin, Erhan Bas
- Robots That Ask For Help: Uncertainty Alignment For Large Language Model Planners Allen Z. Ren et al.
- Large Language Models For Telecom: Forthcoming Impact On The Industry Ali Maatouk, Nicola Piovesan, Fadhel Ayed, Antonio De Domenico, Merouane Debbah
- Can Chatgpt Forecast Stock Price Movements? Return Predictability And Large Language Models Alejandro Lopez-lira, Yuehua Tang
- Mamba: Linear-time Sequence Modeling With Selective State Spaces Albert Gu, Tri Dao
- Mistral 7B Albert Q. Jiang et al.
- Powerinfer: Fast Large Language Model Serving With A Consumer-grade GPU Yixin Song, Zeyu Mi, Haotong Xie, Haibo Chen
- Efficient And Effective Text Encoding For Chinese Llama And Alpaca Yiming Cui, Ziqing Yang, Xin Yao
- Llm-eval: Unified Multi-dimensional Automatic Evaluation For Open-domain Conversations With Large Language Models Yen-ting Lin, Yun-nung Chen
- When Prompt-based Incremental Learning Does Not Meet Strong Pretraining Yu-ming Tang, Yi-xing Peng, Wei-shi Zheng
- A Survey On Model Compression For Large Language Models Xunyu Zhu, Jian Li, Yong Liu, Can Ma, Weiping Wang
- Teaching Large Language Models To Self-debug Xinyun Chen, Maxwell Lin, Nathanael Schärli, Denny Zhou
- Vision Language Models In Autonomous Driving: A Survey And Outlook Xingcheng Zhou et al.
- Llm-pruner: On The Structural Pruning Of Large Language Models Xinyin Ma, Gongfan Fang, Xinchao Wang
- Delving Into Multimodal Prompting For Fine-grained Visual Classification Xin Jiang et al.
- Representation Learning With Large Language Models For Recommendation Xubin Ren et al.
- Universalner: Targeted Distillation From Large Language Models For Open Named Entity Recognition Wenxuan Zhou, Sheng Zhang, Yu Gu, Muhao Chen, Hoifung Poon
- Minillm: Knowledge Distillation Of Large Language Models Yuxian Gu, Li Dong, Furu Wei, Minlie Huang
- Tool Learning With Foundation Models Yujia Qin et al.
- Large Language Model As Attributed Training Data Generator: A Tale Of Diversity And Bias Yue Yu et al.
- Preventing Zero-shot Transfer Degradation In Continual Learning Of Vision-language Models Zangwei Zheng et al.
- Large Language Models In Healthcare And Medical Domain: A Review Zabir Al Nazi, Wei Peng
- Hard Prompts Made Easy: Gradient-based Discrete Optimization For Prompt Tuning And Discovery Yuxin Wen et al.
- Billm: Pushing The Limit Of Post-training Quantization For Llms Wei Huang et al.
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- Large Language Models Meet Collaborative Filtering: An Efficient All-round Llm-based Recommender System Sein Kim et al.
- Findings Of The Second Babylm Challenge: Sample-efficient Pretraining On Developmentally Plausible Corpora Michael Y. Hu et al.
- A Review Of Large Language Models And Autonomous Agents In Chemistry Mayk Caldas Ramos, Christopher J. Collison, Andrew D. White
- Linrec: Linear Attention Mechanism For Long-term Sequential Recommender Systems Langming Liu et al.
- Pixart-Σ: Weak-to-strong Training Of Diffusion Transformer For 4K Text-to-image Generation Junsong Chen et al.
- ORPO: Monolithic Preference Optimization Without Reference Model Jiwoo Hong, Noah Lee, James Thorne
- Revolutionizing Finance With Llms: An Overview Of Applications And Insights Huaqin Zhao et al.
- Gemma 2: Improving Open Language Models At A Practical Size Gemma Team et al.
- Gemini 1.5: Unlocking Multimodal Understanding Across Millions Of Tokens Of Context Gemini Team et al.
- AI And Memory Wall Amir Gholami et al.
- Optimization Methods For Personalizing Large Language Models Through Retrieval Augmentation Alireza Salemi, Surya Kallumadi, Hamed Zamani
- A Survey On LoRA Of Large Language Models Yuren Mao et al.
- Survey On Large Language Model-enhanced Reinforcement Learning: Concept, Taxonomy, And Methods Yuji Cao et al.
- Understanding Llms: A Comprehensive Overview From Training To Inference Yiheng Liu et al.
- Unist: A Prompt-empowered Universal Model For Urban Spatio-temporal Prediction Yuan Yuan, Jingtao Ding, Jie Feng, Depeng Jin, Yong Li
- Llamafactory: Unified Efficient Fine-tuning Of 100+ Language Models Yaowei Zheng et al.
- Biomistral: A Collection Of Open-source Pretrained Large Language Models For Medical Domains Yanis Labrak et al.
- Data-efficient Fine-tuning For Llm-based Recommendation Xinyu Lin et al.
- Mgte: Generalized Long-context Text Representation And Reranking Models For Multilingual Text Retrieval Xin Zhang et al.
- Searching For Best Practices In Retrieval-augmented Generation Xiaohua Wang et al.
🏷 EMNLP
🏷 Ethics and Bias
- Sequence-to-sequence Learning As Beam-search Optimization Sam Wiseman, Alexander M. Rush
- Topic Aware Neural Response Generation Chen Xing et al.
- Steering Output Style And Topic In Neural Response Generation Di Wang, Nebojsa Jojic, Chris Brockett, Eric Nyberg
- A Unified Query-based Generative Model For Question Generation And Question Answering Linfeng Song, Zhiguo Wang, Wael Hamza
- Language Gans Falling Short Massimo Caccia et al.
- An Affect-rich Neural Conversational Model With Biased Attention And Weighted Cross-entropy Loss Peixiang Zhong, Di Wang, Chunyan Miao
- Toward Diverse Text Generation With Inverse Reinforcement Learning Zhan Shi, Xinchi Chen, Xipeng Qiu, Xuanjing Huang
- Evaluating Text Gans As Language Models Guy Tevet, Gavriel Habib, Vered Shwartz, Jonathan Berant
- Recosa: Detecting The Relevant Contexts With Self-attention For Multi-turn Dialogue Generation Hainan Zhang, Yanyan Lan, Liang Pang, Jiafeng Guo, Xueqi Cheng
- Attention Is Not Explanation Sarthak Jain, Byron C. Wallace
- Masked Language Model Scoring Julian Salazar, Davis Liang, Toan Q. Nguyen, Katrin Kirchhoff
- Good-enough Compositional Data Augmentation Jacob Andreas
- Distilbert, A Distilled Version Of BERT: Smaller, Faster, Cheaper And Lighter Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
- Universal Adversarial Triggers For Attacking And Analyzing NLP Eric Wallace, Shi Feng, Nikhil Kandpal, Matt Gardner, Sameer Singh
- Generalization In Generation: A Closer Look At Exposure Bias Florian Schmidt
- End-to-end Bias Mitigation By Modelling Biases In Corpora Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- Exbert: A Visual Analysis Tool To Explore Learned Representations In Transformers Models Benjamin Hoover, Hendrik Strobelt, Sebastian Gehrmann
- Winogrande: An Adversarial Winograd Schema Challenge At Scale Keisuke Sakaguchi, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi
- Learning To Deceive With Attention-based Explanations Danish Pruthi, Mansi Gupta, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton
- Scheduled Sampling For Transformers Tsvetomila Mihaylova, André F. T. Martins
- Berts Of A Feather Do Not Generalize Together: Large Variability In Generalization Across Models With Similar Test Set Performance R. Thomas Mccoy, Junghyun Min, Tal Linzen
- Inducing Brain-relevant Bias In Natural Language Processing Models Dan Schwartz, Mariya Toneva, Leila Wehbe
- A Multiscale Visualization Of Attention In The Transformer Model Jesse Vig
- Controlling The Output Length Of Neural Machine Translation Surafel Melaku Lakew, Mattia Di Gangi, Marcello Federico
- Model Compression With Two-stage Multi-teacher Knowledge Distillation For Web Question Answering System Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang
- Semantically Conditioned Dialog Response Generation Via Hierarchical Disentangled Self-attention Wenhu Chen, Jianshu Chen, Pengda Qin, Xifeng Yan, William Yang Wang
- Visualizing Attention In Transformer-based Language Representation Models Jesse Vig
- Modifying Memories In Transformer Models Chen Zhu et al.
- Modelling Hierarchical Structure Between Dialogue Policy And Natural Language Generator With Option Framework For Task-oriented Dialogue System Jianhong Wang, Yuan Zhang, Tae-kyun Kim, Yunjie Gu
- Artificial Intelligence Versus Maya Angelou: Experimental Evidence That People Cannot Differentiate Ai-generated From Human-written Poetry Nils Köbis, Luca Mossink
- Reducing Gender Bias In Neural Machine Translation As A Domain Adaptation Problem Danielle Saunders, Bill Byrne
- ERNIE-GEN: An Enhanced Multi-flow Pre-training And Fine-tuning Framework For Natural Language Generation Dongling Xiao et al.
- Gedi: Generative Discriminator Guided Sequence Generation Ben Krause et al.
- Unqovering Stereotyping Biases Via Underspecified Questions Tao Li, Tushar Khot, Daniel Khashabi, Ashish Sabharwal, Vivek Srikumar
- Recipes For Safety In Open-domain Chatbots Jing Xu et al.
- Just Ask: Learning To Answer Questions From Millions Of Narrated Videos Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- ABNIRML: Analyzing The Behavior Of Neural IR Models Sean Macavaney, Sergey Feldman, Nazli Goharian, Doug Downey, Arman Cohan
- Contrastive Learning With Adversarial Perturbations For Conditional Text Generation Seanie Lee, Dong Bok Lee, Sung Ju Hwang
- BLEURT: Learning Robust Metrics For Text Generation Thibault Sellam, Dipanjan Das, Ankur P. Parikh
- Facts As Experts: Adaptable And Interpretable Neural Memory Over Symbolic Knowledge Pat Verga, Haitian Sun, Livio Baldini Soares, William W. Cohen
- Text Generation By Learning From Demonstrations Richard Yuanzhe Pang, He He
- The Language Interpretability Tool: Extensible, Interactive Visualizations And Analysis For NLP Models Ian Tenney et al.
- Residual Energy-based Models For Text Generation Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'aurelio Ranzato
- Can You Put It All Together: Evaluating Conversational Agents' Ability To Blend Skills Eric Michael Smith, Mary Williamson, Kurt Shuster, Jason Weston, Y-lan Boureau
- Mitigating Gender Bias For Neural Dialogue Generation With Adversarial Learning Haochen Liu et al.
- Generate Natural Language Explanations For Recommendation Hanxiong Chen, Xu Chen, Shaoyun Shi, Yongfeng Zhang
- Bias Out-of-the-box: An Empirical Analysis Of Intersectional Occupational Biases In Popular Generative Language Models Hannah Kirk et al.
- G-transformer For Document-level Machine Translation Guangsheng Bao, Yue Zhang, Zhiyang Teng, Boxing Chen, Weihua Luo
- Cutting Down On Prompts And Parameters: Simple Few-shot Learning With Language Models Robert L. Iv Logan et al.
- Mitigating Political Bias In Language Models Through Reinforced Calibration Ruibo Liu et al.
- Process For Adapting Language Models To Society (PALMS) With Values-targeted Datasets Irene Solaiman, Christy Dennison
- Revealing Persona Biases In Dialogue Systems Emily Sheng, Josh Arnold, Zhou Yu, Kai-wei Chang, Nanyun Peng
- Societal Biases In Language Generation: Progress And Challenges Emily Sheng, Kai-wei Chang, Premkumar Natarajan, Nanyun Peng
- Bitfit: Simple Parameter-efficient Fine-tuning For Transformer-based Masked Language-models Elad Ben Zaken, Shauli Ravfogel, Yoav Goldberg
- Improving Gender Fairness Of Pre-trained Language Models Without Catastrophic Forgetting Zahra Fatemi, Chen Xing, Wenhao Liu, Caiming Xiong
- Causal Attention For Vision-language Tasks Xu Yang, Hanwang Zhang, Guojun Qi, Jianfei Cai
- Calibrate Before Use: Improving Few-shot Performance Of Language Models Tony Z. Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh
- Meta-learning Via Language Model In-context Tuning Yanda Chen, Ruiqi Zhong, Sheng Zha, George Karypis, He He
- One Teacher Is Enough? Pre-trained Language Model Distillation From Multiple Teachers Chuhan Wu, Fangzhao Wu, Yongfeng Huang
- AI Chains: Transparent And Controllable Human-ai Interaction By Chaining Large Language Model Prompts Tongshuang Wu, Michael Terry, Carrie J. Cai
- Multilingual LAMA: Investigating Knowledge In Multilingual Pretrained Language Models Nora Kassner, Philipp Dufter, Hinrich Schütze
- Sustainable Modular Debiasing Of Language Models Anne Lauscher, Tobias Lüken, Goran Glavaš
- Worst Of Both Worlds: Biases Compound In Pre-trained Vision-and-language Models Tejas Srinivasan, Yonatan Bisk
- Large Pre-trained Language Models Contain Human-like Biases Of What Is Right And Wrong To Do Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin A. Rothkopf, Kristian Kersting
- Visqa: X-raying Vision And Language Reasoning In Transformers Theo Jaunet et al.
- CANINE: Pre-training An Efficient Tokenization-free Encoder For Language Representation Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting
- Challenges In Detoxifying Language Models Johannes Welbl et al.
- Using Adversarial Attacks To Reveal The Statistical Bias In Machine Reading Comprehension Models Jieyu Lin, Jiajie Zou, Nai Ding
- Code Structure Guided Transformer For Source Code Summarization Shuzheng Gao et al.
- Variational Information Bottleneck For Effective Low-resource Fine-tuning Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- Scaling Language Models: Methods, Analysis & Insights From Training Gopher Jack W. Rae et al.
- Redditbias: A Real-world Resource For Bias Evaluation And Debiasing Of Conversational Language Models Soumya Barikeri, Anne Lauscher, Ivan Vulić, Goran Glavaš
- Few-shot Knowledge Graph-to-text Generation With Pretrained Language Models Junyi Li et al.
- MATE: Multi-view Attention For Table Transformer Efficiency Julian Martin Eisenschlos, Maharshi Gor, Thomas Müller, William W. Cohen
- Perturbation Augmentation For Fairer NLP Rebecca Qian et al.
- Moral Mimicry: Large Language Models Produce Moral Rationalizations Tailored To Political Identity Gabriel Simmons
- Large Language Models Encode Clinical Knowledge Karan Singhal et al.
- Dall-eval: Probing The Reasoning Skills And Social Biases Of Text-to-image Generation Models Jaemin Cho, Abhay Zala, Mohit Bansal
- Structured Like A Language Model: Analysing AI As An Automated Subject Liam Magee, Vanicka Arora, Luke Munn
- Lamda: Language Models For Dialog Applications Romal Thoppilan et al.
- Shortcut Learning Of Large Language Models In Natural Language Understanding Mengnan Du, Fengxiang He, Na Zou, Dacheng Tao, Xia Hu
- NLX-GPT: A Model For Natural Language Explanations In Vision And Vision-language Tasks Fawaz Sammani, Tanmoy Mukherjee, Nikos Deligiannis
- Capturing Failures Of Large Language Models Via Human Cognitive Biases Erik Jones, Jacob Steinhardt
- Prompt Distribution Learning Yuning Lu, Jianzhuang Liu, Yonggang Zhang, Yajing Liu, Xinmei Tian
- Cont: Contrastive Neural Text Generation Chenxin An et al.
- Prompting GPT-3 To Be Reliable Chenglei Si et al.
- Exploring The Limits Of Domain-adaptive Training For Detoxifying Large-scale Language Models Boxin Wang et al.
- BLOOM: A 176b-parameter Open-access Multilingual Language Model Bigscience Workshop et al.
- Improving Alignment Of Dialogue Agents Via Targeted Human Judgements Amelia Glaese et al.
- Beyond The Imitation Game: Quantifying And Extrapolating The Capabilities Of Language Models Aarohi Srivastava et al.
- Palm: Scaling Language Modeling With Pathways Aakanksha Chowdhery et al.
- On Second Thought, Let's Not Think Step By Step! Bias And Toxicity In Zero-shot Reasoning Omar Shaikh, Hongxin Zhang, William Held, Michael Bernstein, Diyi Yang
- Chatgpt: The End Of Online Exam Integrity? Teo Susnjak
- LIFT: Language-interfaced Fine-tuning For Non-language Machine Learning Tasks Tuan Dinh et al.
- Quantifying Memorization Across Neural Language Models Nicholas Carlini et al.
- Co-writing Screenplays And Theatre Scripts With Language Models: An Evaluation By Industry Professionals Piotr Mirowski, Kory W. Mathewson, Jaylen Pittman, Richard Evans
- Holistic Evaluation Of Language Models Percy Liang et al.
- A Systematic Study And Comprehensive Evaluation Of Chatgpt On Benchmark Datasets Md Tahmid Rahman Laskar et al.
- Practical And Ethical Challenges Of Large Language Models In Education: A Systematic Scoping Review Lixiang Yan et al.
- Do Llms Exhibit Human-like Response Biases? A Case Study In Survey Design Lindia Tjuatja, Valerie Chen, Sherry Tongshuang Wu, Ameet Talwalkar, Graham Neubig
- Judging Llm-as-a-judge With Mt-bench And Chatbot Arena Lianmin Zheng et al.
- In-context Impersonation Reveals Large Language Models' Strengths And Biases Leonard Salewski, Stephan Alaniz, Isabel Rio-torto, Eric Schulz, Zeynep Akata
- Mvbench: A Comprehensive Multi-modal Video Understanding Benchmark Kunchang Li et al.
- Backdooring Instruction-tuned Large Language Models With Virtual Prompt Injection Jun Yan et al.
- Transferable Decoding With Visual Entities For Zero-shot Image Captioning Junjie Fei et al.
- Towards Llm-based Autograding For Short Textual Answers Johannes Schneider, Bernd Schenk, Christina Niklaus
- LEXTREME: A Multi-lingual And Multi-task Benchmark For The Legal Domain Joel Niklaus et al.
- Is Chatgpt Fair For Recommendation? Evaluating Fairness In Large Language Model Recommendation Jizhi Zhang et al.
- The Political Ideology Of Conversational AI: Converging Evidence On Chatgpt's Pro-environmental, Left-libertarian Orientation Jochen Hartmann, Jasper Schwenzow, Maximilian Witte
- Bias And Fairness In Chatbots: An Overview Jintang Xue et al.
- Graphix-t5: Mixing Pre-trained Transformers With Graph-aware Layers For Text-to-sql Parsing Jinyang Li et al.
- On The Robustness Of Chatgpt: An Adversarial And Out-of-distribution Perspective Jindong Wang et al.
- Ethical Chatgpt: Concerns, Challenges, And Commandments Jianlong Zhou, Heimo Müller, Andreas Holzinger, Fang Chen
- How Can Recommender Systems Benefit From Large Language Models: A Survey Jianghao Lin et al.
- Chatgpt: Jack Of All Trades, Master Of None Jan Kocoń et al.
- The Robots Are Here: Navigating The Generative AI Revolution In Computing Education James Prather et al.
- Principle-driven Self-alignment Of Language Models From Scratch With Minimal Human Supervision Zhiqing Sun et al.
- The Bigscience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset Hugo Laurençon et al.
- Large Language Models Can Infer Psychological Dispositions Of Social Media Users Heinrich Peters, Sandra Matz
- Chain-of-verification Reduces Hallucination In Large Language Models Shehzaad Dhuliawala et al.
- Decoding Chatgpt: A Taxonomy Of Existing Research, Current Challenges, And Possible Future Directions Shahab Saquib Sohail et al.
- The Moral Authority Of Chatgpt Sebastian Krügel, Andreas Ostermaier, Matthias Uhl
- Large Language Models Are Competitive Near Cold-start Recommenders For Language- And Item-based Preferences Scott Sanner, Krisztian Balog, Filip Radlinski, Ben Wedin, Lucas Dixon
- Palm 2 Technical Report Rohan Anil et al.
- Designerly Understanding: Information Needs For Model Transparency To Support Design Ideation For Ai-powered User Experience Q. Vera Liao, Hariharan Subramonyam, Jennifer Wang, Jennifer Wortman Vaughan
- Prompting The Hidden Talent Of Web-scale Speech Models For Zero-shot Task Generalization Puyuan Peng, Brian Yan, Shinji Watanabe, David Harwath
- AI Transparency In The Age Of Llms: A Human-centered Research Roadmap Q. Vera Liao, Jennifer Wortman Vaughan
- Large Language Models Sensitivity To The Order Of Options In Multiple-choice Questions Pouya Pezeshkpour, Estevam Hruschka
- Regulating Chatgpt And Other Large Generative AI Models Philipp Hacker, Andreas Engel, Marco Mauer
- Llama-adapter V2: Parameter-efficient Visual Instruction Model Peng Gao et al.
- Starcoder: May The Source Be With You! Raymond Li et al.
- Sources Of Hallucination By Large Language Models On Inference Tasks Nick Mckenna et al.
- Large Language Models Are Zero-shot Time Series Forecasters Nate Gruver, Marc Finzi, Shikai Qiu, Andrew Gordon Wilson
- Scaling Vision Transformers To 22 Billion Parameters Mostafa Dehghani et al.
- Creating Trustworthy Llms: Dealing With Hallucinations In Healthcare AI Muhammad Aurangzeb Ahmad, Ilker Yaramis, Taposh Dutta Roy
- Open Sesame! Universal Black Box Jailbreaking Of Large Language Models Raz Lapid, Ron Langberg, Moshe Sipper
- Selenite: Scaffolding Online Sensemaking With Comprehensive Overviews Elicited From Large Language Models Michael Xieyang Liu et al.
- LAMM: Language-assisted Multi-modal Instruction-tuning Dataset, Framework, And Benchmark Zhenfei Yin et al.
- Red Teaming Chatgpt Via Jailbreaking: Bias, Robustness, Reliability And Toxicity Terry Yue Zhuo, Yujin Huang, Chunyang Chen, Zhenchang Xing
- Having Beer After Prayer? Measuring Cultural Bias In Large Language Models Tarek Naous, Michael J. Ryan, Alan Ritter, Wei Xu
- Chatgpt Perpetuates Gender Bias In Machine Translation And Ignores Non-gendered Pronouns: Findings Across Bengali And Five Other Low-resource Languages Sourojit Ghosh, Aylin Caliskan
- Pythia: A Suite For Analyzing Large Language Models Across Training And Scaling Stella Biderman et al.
- Thoughtsource: A Central Hub For Large Language Model Reasoning Data Simon Ott et al.
- Mitigating Object Hallucinations In Large Vision-language Models Through Visual Contrastive Decoding Sicong Leng et al.
- Unlocking The Potential Of Chatgpt: A Comprehensive Exploration Of Its Applications, Advantages, Limitations, And Future Directions In Natural Language Processing Walid Hariri
- Language Model Behavior: A Comprehensive Survey Tyler A. Chang, Benjamin K. Bergen
- GPT Detectors Are Biased Against Non-native English Writers Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, James Zou
- Can Chatgpt Assess Human Personalities? A General Evaluation Framework Haocong Rao, Cyril Leung, Chunyan Miao
- Gender Bias And Stereotypes In Large Language Models Hadas Kotek, Rikker Dockum, David Q. Sun
- Chatgpt For Shaping The Future Of Dentistry: The Potential Of Multi-modal Large Language Model Hanyao Huang et al.
- Personality Traits In Large Language Models Greg Serapio-garcía et al.
- Explainability For Large Language Models: A Survey Haiyan Zhao et al.
- Text Matching Improves Sequential Recommendation By Reducing Popularity Biases Zhenghao Liu et al.
- Do Large Language Models Show Decision Heuristics Similar To Humans? A Case Study Using GPT-3.5 Gaurav Suri, Lily R. Slater, Ali Ziaee, Morgan Nguyen
- Should Chatgpt Be Biased? Challenges And Risks Of Bias In Large Language Models Emilio Ferrara
- Assigning AI: Seven Approaches For Students, With Prompts Ethan Mollick, Lilach Mollick
- Lasuie: Unifying Information Extraction With Latent Adaptive Structure-aware Generative Language Model Hao Fei et al.
- GPT-4 Can Pass The Korean National Licensing Examination For Korean Medicine Doctors Dongyeop Jang, Tae-rim Yun, Choong-yeol Lee, Young-kyu Kwon, Chang-eop Kim
- The Capacity For Moral Self-correction In Large Language Models Deep Ganguli et al.
- Debiasing Vision-language Models Via Biased Prompts Ching-yao Chuang, Varun Jampani, Yuanzhen Li, Antonio Torralba, Stefanie Jegelka
- Supporting Human-ai Collaboration In Auditing Llms With Llms Charvi Rastogi, Marco Tulio Ribeiro, Nicholas King, Harsha Nori, Saleema Amershi
- Chatgpt And A New Academic Reality: Artificial Intelligence-written Research Papers And The Ethics Of The Large Language Models In Scholarly Publishing Brady Lund et al.
- A Short Survey Of Viewing Large Language Models In Legal Aspect Zhongxiang Sun
- Friend Or Foe? Exploring The Implications Of Large Language Models On The Science System Benedikt Fecher, Marcel Hebing, Melissa Laufer, Jörg Pohle, Fabian Sofsky
- Clinical Camel: An Open Expert-level Medical Language Model With Dialogue-based Knowledge Encoding Augustin Toma et al.
- Med-halt: Medical Domain Hallucination Test For Large Language Models Ankit Pal, Logesh Kumar Umapathi, Malaikannan Sankarasubbu
- On The Application Of Large Language Models For Language Teaching And Assessment Technology Andrew Caines et al.
- Opening Up Chatgpt: Tracking Openness, Transparency, And Accountability In Instruction-tuned Text Generators Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse
- Toxicity In Chatgpt: Analyzing Persona-assigned Language Models Ameet Deshpande, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan
- A Categorical Archive Of Chatgpt Failures Ali Borji
- Language Model Tokenizers Introduce Unfairness Between Languages Aleksandar Petrov, Emanuele La Malfa, Philip H. S. Torr, Adel Bibi
- Can Chatgpt Forecast Stock Price Movements? Return Predictability And Large Language Models Alejandro Lopez-lira, Yuehua Tang
- Chatgpt: More Than A Weapon Of Mass Deception, Ethical Challenges And Responses From The Human-centered Artificial Intelligence (HCAI) Perspective Alejo Jose G. Sison, Marco Tulio Daza, Roberto Gozalo-brizuela, Eduardo C. Garrido-merchán
- "kelly Is A Warm Person, Joseph Is A Role Model": Gender Biases In Llm-generated Reference Letters Yixin Wan et al.
- Mindmap: Knowledge Graph Prompting Sparks Graph Of Thoughts In Large Language Models Yilin Wen, Zifeng Wang, Jimeng Sun
- Assessing Cross-cultural Alignment Between Chatgpt And Human Societies: An Empirical Study Yong Cao et al.
- G-eval: NLG Evaluation Using GPT-4 With Better Human Alignment Yang Liu et al.
- Trustworthy Llms: A Survey And Guideline For Evaluating Large Language Models' Alignment Yang Liu et al.
- Mental-llm: Leveraging Large Language Models For Mental Health Prediction Via Online Text Data Xuhai Xu et al.
- Unveiling Security, Privacy, And Ethical Concerns Of Chatgpt Xiaodong Wu, Ran Duan, Jianbing Ni
- Representation Learning With Large Language Models For Recommendation Xubin Ren et al.
- Minillm: Knowledge Distillation Of Large Language Models Yuxian Gu, Li Dong, Furu Wei, Minlie Huang
- Large Language Models Are Zero-shot Rankers For Recommender Systems Yupeng Hou et al.
- An Empirical Study Of Catastrophic Forgetting In Large Language Models During Continual Fine-tuning Yun Luo et al.
- Large Language Model As Attributed Training Data Generator: A Tale Of Diversity And Bias Yue Yu et al.
- Aligning Large Language Models With Human: A Survey Yufei Wang et al.
- Textbooks Are All You Need II: Phi-1.5 Technical Report Yuanzhi Li et al.
- Mapping The Ethics Of Generative AI: A Comprehensive Scoping Review Thilo Hagendorff
- Chatgpt As Research Scientist: Probing Gpt's Capabilities As A Research Librarian, Research Ethicist, Data Generator And Data Predictor Steven A. Lehr, Aylin Caliskan, Suneragiri Liyanage, Mahzarin R. Banaji
- A Comprehensive Survey Of Hallucination Mitigation Techniques In Large Language Models S. M Towhidul Islam Tonmoy et al.
- Fine-tuned Language Models Generate Stable Inorganic Materials As Text Nate Gruver et al.
- Exploring Chatgpt And Its Impact On Society Md. Asraful Haque, Shuai Li
- Codeaid: Evaluating A Classroom Deployment Of An Llm-based Programming Assistant That Balances Student And Educator Needs Majeed Kazemitabaar et al.
- Openmedlm: Prompt Engineering Can Out-perform Fine-tuning In Medical Question-answering With Open-source Large Language Models Jenish Maharjan et al.
- Closing The Gap Between Open-source And Commercial Large Language Models For Medical Evidence Summarization Gongbo Zhang et al.
- Olmo: Accelerating The Science Of Language Models Dirk Groeneveld et al.
- Open Source Language Models Can Provide Feedback: Evaluating Llms' Ability To Help Students Using Gpt-4-as-a-judge Charles Koutcheme et al.
- Large Language Models And User Trust: Consequence Of Self-referential Learning Loop And The Deskilling Of Healthcare Professionals Avishek Choudhury, Zaria Chaudhry
- Why And When Llm-based Assistants Can Go Wrong: Investigating The Effectiveness Of Prompt-based Interactions For Software Help-seeking Anjali Khurana, Hari Subramonyam, Parmit K Chilana
- Trustllm: Trustworthiness In Large Language Models Yue Huang et al.
- Sora: A Review On Background, Technology, Limitations, And Opportunities Of Large Vision Models Yixin Liu et al.
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
- Farsight: Fostering Responsible AI Awareness During AI Application Prototyping Zijie J. Wang, Chinmay Kulkarni, Lauren Wilcox, Michael Terry, Michael Madaio
- Measurement Of Llm's Philosophies Of Human Nature Minheng Ni et al.
🏷 Evaluation
- Topic Aware Neural Response Generation Chen Xing et al.
- Separating Answers From Queries For Neural Reading Comprehension Dirk Weissenborn
- The LAMBADA Dataset: Word Prediction Requiring A Broad Discourse Context Denis Paperno et al.
- Steering Output Style And Topic In Neural Response Generation Di Wang, Nebojsa Jojic, Chris Brockett, Eric Nyberg
- Attention Strategies For Multi-source Sequence-to-sequence Learning Jindřich Libovický, Jindřich Helcl
- Neural Personalized Response Generation As Domain Adaptation Weinan Zhang, Ting Liu, Yifa Wang, Qingfu Zhu
- Adversarial Learning For Neural Dialogue Generation Jiwei Li et al.
- Ask The Right Questions: Active Question Reformulation With Reinforcement Learning Christian Buck et al.
- Searchqa: A New Q&A Dataset Augmented With Context From A Search Engine Matthew Dunn et al.
- Latent Intention Dialogue Models Tsung-hsien Wen, Yishu Miao, Phil Blunsom, Steve Young
- Neural Response Generation With Dynamic Vocabularies Yu Wu et al.
- Parlai: A Dialog Research Software Platform Alexander H. Miller et al.
- Maskgan: Better Text Generation Via Filling In The______ William Fedus, Ian Goodfellow, Andrew M. Dai
- Dialogue Generation: From Imitation Learning To Inverse Reinforcement Learning Ziming Li, Julia Kiseleva, Maarten De Rijke
- Fast Abstractive Summarization With Reinforce-selected Sentence Rewriting Yen-chun Chen, Mohit Bansal
- The Memad Submission To The WMT18 Multimodal Translation Task Stig-arne Grönroos et al.
- Efficient Contextualized Representation: Language Model Pruning For Sequence Labeling Liyuan Liu, Xiang Ren, Jingbo Shang, Jian Peng, Jiawei Han
- Character-level Language Modeling With Deeper Self-attention Rami Al-rfou, Dokook Choe, Noah Constant, Mandy Guo, Llion Jones
- Commonsense For Generative Multi-hop Question Answering Tasks Lisa Bauer, Yicheng Wang, Mohit Bansal
- Adversarial Over-sensitivity And Over-stability Strategies For Dialogue Models Tong Niu, Mohit Bansal
- A Retrieve-and-edit Framework For Predicting Structured Outputs Tatsunori B. Hashimoto, Kelvin Guu, Yonatan Oren, Percy Liang
- Advancing The State Of The Art In Open Domain Dialog Systems Through The Alexa Prize Chandra Khatri et al.
- Wizard Of Wikipedia: Knowledge-powered Conversational Agents Emily Dinan et al.
- Language Gans Falling Short Massimo Caccia et al.
- Retrieve And Refine: Improved Sequence Generation Models For Dialogue Jason Weston, Emily Dinan, Alexander H. Miller
- Sdnet: Contextualized Attention-based Deep Network For Conversational Question Answering Chenguang Zhu, Michael Zeng, Xuedong Huang
- Sentence Encoders On Stilts: Supplementary Training On Intermediate Labeled-data Tasks Jason Phang, Thibault Févry, Samuel R. Bowman
- An Affect-rich Neural Conversational Model With Biased Attention And Weighted Cross-entropy Loss Peixiang Zhong, Di Wang, Chunyan Miao
- A Dataset For Document Grounded Conversations Kangyan Zhou, Shrimai Prabhumoye, Alan W Black
- Hybrid Retrieval-generation Reinforced Agent For Medical Image Report Generation Christy Y. Li, Xiaodan Liang, Zhiting Hu, Eric P. Xing
- Retrieval-enhanced Adversarial Training For Neural Response Generation Qingfu Zhu, Lei Cui, Weinan Zhang, Furu Wei, Ting Liu
- Controllable Neural Story Plot Generation Via Reward Shaping Pradyumna Tambwekar et al.
- Disentangling Language And Knowledge In Task-oriented Dialogs Dinesh Raghu, Nikhil Gupta, Mausam
- Generating Informative And Diverse Conversational Responses Via Adversarial Information Maximization Yizhe Zhang et al.
- Attention-guided Answer Distillation For Machine Reading Comprehension Minghao Hu et al.
- Topic-based Evaluation For Conversational Bots Fenfei Guo et al.
- Towards Empathetic Open-domain Conversation Models: A New Benchmark And Dataset Hannah Rashkin, Eric Michael Smith, Margaret Li, Y-lan Boureau
- "bilingual Expert" Can Find Translation Errors Kai Fan et al.
- Multi-cast Attention Networks For Retrieval-based Question Answering And Response Prediction Yi Tay, Luu Anh Tuan, Siu Cheung Hui
- On Evaluating And Comparing Open Domain Dialog Systems Anu Venkatesh et al.
- Polite Dialogue Generation Without Parallel Data Tong Niu, Mohit Bansal
- Conversational AI: The Science Behind The Alexa Prize Ashwin Ram et al.
- Evaluating Text Gans As Language Models Guy Tevet, Gavriel Habib, Vered Shwartz, Jonathan Berant
- Multi-passage BERT: A Globally Normalized BERT Model For Open-domain Question Answering Zhiguo Wang, Patrick Ng, Xiaofei Ma, Ramesh Nallapati, Bing Xiang
- Unified Vision-language Pre-training For Image Captioning And VQA Luowei Zhou et al.
- Ensemble-based Deep Reinforcement Learning For Chatbots Heriberto Cuayáhuitl et al.
- How Can We Know What Language Models Know? Zhengbao Jiang, Frank F. Xu, Jun Araki, Graham Neubig
- Recosa: Detecting The Relevant Contexts With Self-attention For Multi-turn Dialogue Generation Hainan Zhang, Yanyan Lan, Liang Pang, Jiafeng Guo, Xueqi Cheng
- Multimodal Attention Networks For Low-level Vision-and-language Navigation Federico Landi, Lorenzo Baraldi, Marcella Cornia, Massimiliano Corsini, Rita Cucchiara
- Conversing By Reading: Contentful Neural Conversation With On-demand Machine Reading Lianhui Qin et al.
- Moverscore: Text Generation Evaluating With Contextualized Embeddings And Earth Mover Distance Wei Zhao et al.
- Scalable Attentive Sentence-pair Modeling Via Distilled Sentence Embedding Oren Barkan et al.
- Reqa: An Evaluation For End-to-end Answer Retrieval Models Amin Ahmad, Noah Constant, Yinfei Yang, Daniel Cer
- Structbert: Incorporating Language Structures Into Pre-training For Deep Language Understanding Wei Wang et al.
- Probing Natural Language Inference Models Through Semantic Fragments Kyle Richardson, Hai Hu, Lawrence S. Moss, Ashish Sabharwal
- Bert4rec: Sequential Recommendation With Bidirectional Encoder Representations From Transformer Fei Sun et al.
- Transformer-xl: Attentive Language Models Beyond A Fixed-length Context Zihang Dai et al.
- Align, Mask And Select: A Simple Method For Incorporating Commonsense Knowledge Into Language Representation Models Zhi-xiu Ye, Qian Chen, Wen Wang, Zhen-hua Ling
- Olmpics -- On What Language Model Pre-training Captures Alon Talmor, Yanai Elazar, Yoav Goldberg, Jonathan Berant
- Data-to-text Generation With Entity Modeling Ratish Puduppully, Li Dong, Mirella Lapata
- Counterfactual Story Reasoning And Generation Lianhui Qin et al.
- Generating Persona Consistent Dialogues By Exploiting Natural Language Inference Haoyu Song, Wei-nan Zhang, Jingwen Hu, Ting Liu
- Reducing Transformer Depth On Demand With Structured Dropout Angela Fan, Edouard Grave, Armand Joulin
- ELI5: Long Form Question Answering Angela Fan et al.
- Improving Transformer Models By Reordering Their Sublayers Ofir Press, Noah A. Smith, Omer Levy
- PEGASUS: Pre-training With Extracted Gap-sentences For Abstractive Summarization Jingqing Zhang, Yao Zhao, Mohammad Saleh, Peter J. Liu
- BERT For Joint Intent Classification And Slot Filling Qian Chen, Zhu Zhuo, Wen Wang
- Exploiting Persona Information For Diverse Generation Of Conversational Responses Haoyu Song, Wei-nan Zhang, Yiming Cui, Dong Wang, Ting Liu
- Dykgchat: Benchmarking Dialogue Generation Grounding On Dynamic Knowledge Graphs Yi-lin Tuan, Yun-nung Chen, Hung-yi Lee
- Sticking To The Facts: Confident Decoding For Faithful Data-to-text Generation Ran Tian, Shashi Narayan, Thibault Sellam, Ankur P. Parikh
- Universal Adversarial Triggers For Attacking And Analyzing NLP Eric Wallace, Shi Feng, Nikhil Kandpal, Matt Gardner, Sameer Singh
- Approximating Interactive Human Evaluation With Self-play For Open-domain Dialog Systems Asma Ghandeharioun et al.
- Compressive Transformers For Long-range Sequence Modelling Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, Timothy P. Lillicrap
- Unified Language Model Pre-training For Natural Language Understanding And Generation Li Dong et al.
- Consistent Dialogue Generation With Self-supervised Feature Learning Yizhe Zhang et al.
- Generalization In Generation: A Closer Look At Exposure Bias Florian Schmidt
- Pythia: Ai-assisted Code Completion System Alexey Svyatkovskiy, Ying Zhao, Shengyu Fu, Neel Sundaresan
- Unsupervised Cross-lingual Representation Learning At Scale Alexis Conneau et al.
- Pay Less Attention With Lightweight And Dynamic Convolutions Felix Wu, Angela Fan, Alexei Baevski, Yann N. Dauphin, Michael Auli
- A Pre-training Based Personalized Dialogue Generation Model With Persona-sparse Data Yinhe Zheng, Rongsheng Zhang, Xiaoxi Mao, Minlie Huang
- Reweighted Proximal Pruning For Large-scale Language Representation Fu-ming Guo, Sijia Liu, Finlay S. Mungall, Xue Lin, Yanzhi Wang
- Cloze-driven Pretraining Of Self-attention Networks Alexei Baevski, Sergey Edunov, Yinhan Liu, Luke Zettlemoyer, Michael Auli
- Tinybert: Distilling BERT For Natural Language Understanding Xiaoqi Jiao et al.
- Frustratingly Easy Natural Question Answering Lin Pan et al.
- Sentence-level Content Planning And Style Specification For Neural Text Generation Xinyu Hua, Lu Wang
- Modeling Graph Structure In Transformer For Better Amr-to-text Generation Jie Zhu et al.
- Incorporating External Knowledge Into Machine Reading For Generative Question Answering Bin Bi et al.
- Commongen: A Constrained Text Generation Challenge For Generative Commonsense Reasoning Bill Yuchen Lin et al.
- Transfertransfo: A Transfer Learning Approach For Neural Network Based Conversational Agents Thomas Wolf, Victor Sanh, Julien Chaumond, Clement Delangue
- The Second Conversational Intelligence Challenge (convai2) Emily Dinan et al.
- An Effective Domain Adaptive Post-training Method For BERT In Response Selection Taesun Whang et al.
- Rankqa: Neural Question Answering With Answer Re-ranking Bernhard Kratzwald, Anna Eigenmann, Stefan Feuerriegel
- End-to-end Bias Mitigation By Modelling Biases In Corpora Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- VL-BERT: Pre-training Of Generic Visual-linguistic Representations Weijie Su et al.
- Winogrande: An Adversarial Winograd Schema Challenge At Scale Keisuke Sakaguchi, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi
- Codegru: Context-aware Deep Learning With Gated Recurrent Unit For Source Code Modeling Yasir Hussain, Zhiqiu Huang, Yu Zhou, Senzhang Wang
- On The Use Of BERT For Neural Machine Translation Stéphane Clinchant, Kweon Woo Jung, Vassilina Nikoulina
- Personalizing Dialogue Agents Via Meta-learning Zhaojiang Lin, Andrea Madotto, Chien-sheng Wu, Pascale Fung
- Leveraging Pre-trained Checkpoints For Sequence Generation Tasks Sascha Rothe, Shashi Narayan, Aliaksei Severyn
- Multi-hop Question Answering Via Reasoning Chains Jifan Chen, Shih-ting Lin, Greg Durrett
- Deepcopy: Grounded Response Generation With Hierarchical Pointer Networks Semih Yavuz, Abhinav Rastogi, Guan-lin Chao, Dilek Hakkani-tur
- Berts Of A Feather Do Not Generalize Together: Large Variability In Generalization Across Models With Similar Test Set Performance R. Thomas Mccoy, Junghyun Min, Tal Linzen
- Data Augmentation For BERT Fine-tuning In Open-domain Question Answering Wei Yang et al.
- Stabilizing Transformers For Reinforcement Learning Emilio Parisotto et al.
- Learning And Evaluating General Linguistic Intelligence Dani Yogatama et al.
- Dialogpt: Large-scale Generative Pre-training For Conversational Response Generation Yizhe Zhang et al.
- Improving Neural Response Diversity With Frequency-aware Cross-entropy Loss Shaojie Jiang, Pengjie Ren, Christof Monz, Maarten De Rijke
- Parameter-efficient Transfer Learning For NLP Neil Houlsby et al.
- Automatic Spanish Translation Of The Squad Dataset For Multilingual Question Answering Casimiro Pio Carrino, Marta R. Costa-jussà, José A. R. Fonollosa
- Structured Pruning Of Large Language Models Ziheng Wang, Jeremy Wohlwend, Tao Lei
- Bertscore: Evaluating Text Generation With BERT Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q. Weinberger, Yoav Artzi
- Attentive History Selection For Conversational Question Answering Chen Qu et al.
- Freelb: Enhanced Adversarial Training For Natural Language Understanding Chen Zhu et al.
- Evaluating Commonsense In Pre-trained Language Models Xuhui Zhou, Yue Zhang, Leyang Cui, Dandan Huang
- Augmenting Self-attention With Persistent Memory Sainbayar Sukhbaatar, Edouard Grave, Guillaume Lample, Herve Jegou, Armand Joulin
- Barack's Wife Hillary: Using Knowledge-graphs For Fact-aware Language Modeling Robert L. Logan IV, Nelson F. Liu, Matthew E. Peters, Matt Gardner, Sameer Singh
- Learning And Evaluating Contextual Embedding Of Source Code Aditya Kanade, Petros Maniatis, Gogul Balakrishnan, Kensen Shi
- Plug And Play Language Models: A Simple Approach To Controlled Text Generation Sumanth Dathathri et al.
- Hello, It's GPT-2 -- How Can I Help You? Towards The Use Of Pretrained Language Models For Task-oriented Dialogue Systems Paweł Budzianowski, Ivan Vulić
- MLQA: Evaluating Cross-lingual Extractive Question Answering Patrick Lewis, Barlas Oğuz, Ruty Rinott, Sebastian Riedel, Holger Schwenk
- What Makes A Good Conversation? How Controllable Attributes Affect Human Judgments Abigail See, Stephen Roller, Douwe Kiela, Jason Weston
- Jointly Optimizing Diversity And Relevance In Neural Response Generation Xiang Gao et al.
- Blockwise Self-attention For Long Document Understanding Jiezhong Qiu et al.
- Span Selection Pre-training For Question Answering Michael Glass et al.
- Adapting And Evaluating A Deep Learning Language Model For Clinical Why-question Answering Andrew Wen, Mohamed Y. Elwazir, Sungrim Moon, Jungwei Fan
- Do Massively Pretrained Language Models Make Better Storytellers? Abigail See, Aneesh Pappu, Rohun Saxena, Akhila Yerukola, Christopher D. Manning
- Cosmos QA: Machine Reading Comprehension With Contextual Commonsense Reasoning Lifu Huang, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi
- Semantically Conditioned Dialog Response Generation Via Hierarchical Disentangled Self-attention Wenhu Chen, Jianshu Chen, Pengda Qin, Xifeng Yan, William Yang Wang
- Learning To Select Knowledge For Response Generation In Dialog Systems Rongzhong Lian, Min Xie, Fan Wang, Jinhua Peng, Hua Wu
- Juice: A Large Scale Distantly Supervised Dataset For Open Domain Context-based Code Generation Rajas Agashe, Srinivasan Iyer, Luke Zettlemoyer
- Bridging The Gap For Tokenizer-free Language Models Dokook Choe, Rami Al-rfou, Mandy Guo, Heeyoung Lee, Noah Constant
- Using Natural Language For Reward Shaping In Reinforcement Learning Prasoon Goyal, Scott Niekum, Raymond J. Mooney
- Gmail Smart Compose: Real-time Assisted Writing Mia Xu Chen et al.
- ACUTE-EVAL: Improved Dialogue Evaluation With Optimized Questions And Multi-turn Comparisons Margaret Li, Jason Weston, Stephen Roller
- Real-time Open-domain Question Answering With Dense-sparse Phrase Index Minjoon Seo et al.
- Context-aware Learning For Neural Machine Translation Sébastien Jean, Kyunghyun Cho
- Fusion Of Detected Objects In Text For Visual Question Answering Chris Alberti, Jeffrey Ling, Michael Collins, David Reitter
- Microsoft Translator At WMT 2019: Towards Large-scale Document-level Neural Machine Translation Marcin Junczys-dowmunt
- ALBERT: A Lite BERT For Self-supervised Learning Of Language Representations Zhenzhong Lan et al.
- Improving Knowledge-aware Dialogue Generation Via Knowledge Base Question Answering Jian Wang et al.
- Syntax-infused Transformer And BERT Models For Machine Translation And Natural Language Understanding Dhanasekar Sundararaman et al.
- A Modular Task-oriented Dialogue System Using A Neural Mixture-of-experts Jiahuan Pei, Pengjie Ren, Maarten De Rijke
- TANDA: Transfer And Adapt Pre-trained Transformer Models For Answer Sentence Selection Siddhant Garg, Thuy Vu, Alessandro Moschitti
- Exploring The Limits Of Transfer Learning With A Unified Text-to-text Transformer Colin Raffel et al.
- Robust Navigation With Language Pretraining And Stochastic Sampling Xiujun Li et al.
- Sg-net: Syntax-guided Machine Reading Comprehension Zhuosheng Zhang et al.
- Modifying Memories In Transformer Models Chen Zhu et al.
- Controlled Hallucinations: Learning To Generate Faithfully From Noisy Data Katja Filippova
- Modelling Hierarchical Structure Between Dialogue Policy And Natural Language Generator With Option Framework For Task-oriented Dialogue System Jianhong Wang, Yuan Zhang, Tae-kyun Kim, Yunjie Gu
- XGLUE: A New Benchmark Dataset For Cross-lingual Pre-training, Understanding And Generation Yaobo Liang et al.
- Bert-of-theseus: Compressing BERT By Progressive Module Replacing Canwen Xu, Wangchunshu Zhou, Tao Ge, Furu Wei, Ming Zhou
- Sequential Latent Knowledge Selection For Knowledge-grounded Dialogue Byeongchang Kim, Jaewoo Ahn, Gunhee Kim
- Russiansuperglue: A Russian Language Understanding Evaluation Benchmark Tatiana Shavrina et al.
- Progressive Generation Of Long Text With Pretrained Language Models Bowen Tan, Zichao Yang, Maruan Al-shedivat, Eric P. Xing, Zhiting Hu
- GO FIGURE: A Meta Evaluation Of Factuality In Summarization Saadia Gabriel, Asli Celikyilmaz, Rahul Jha, Yejin Choi, Jianfeng Gao
- Knowledge-driven Data Construction For Zero-shot Evaluation In Commonsense Question Answering Kaixin Ma et al.
- REALM: Retrieval-augmented Language Model Pre-training Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-wei Chang
- Fine-tuning Pretrained Language Models: Weight Initializations, Data Orders, And Early Stopping Jesse Dodge et al.
- Towards A Human-like Open-domain Chatbot Daniel Adiwardana et al.
- Pretrained Transformers Improve Out-of-distribution Robustness Dan Hendrycks et al.
- Reducing Gender Bias In Neural Machine Translation As A Domain Adaptation Problem Danielle Saunders, Bill Byrne
- Conversational Question Reformulation Via Sequence-to-sequence Architectures And Pretrained Language Models Sheng-chieh Lin et al.
- Intermediate-task Transfer Learning With Pretrained Models For Natural Language Understanding: When And Why Does It Work? Yada Pruksachatkun et al.
- WT5?! Training Text-to-text Models To Explain Their Predictions Sharan Narang et al.
- Pretrained Transformers For Simple Question Answering Over Knowledge Graphs D. Lukovnikov, A. Fischer, J. Lehmann
- Dense Passage Retrieval For Open-domain Question Answering Vladimir Karpukhin et al.
- Speaker-aware BERT For Multi-turn Response Selection In Retrieval-based Chatbots Jia-chen Gu et al.
- Unnatural Language Inference Koustuv Sinha, Prasanna Parthasarathi, Joelle Pineau, Adina Williams
- Detecting Hallucinated Content In Conditional Neural Sequence Generation Chunting Zhou et al.
- Optimus: Organizing Sentences Via Pre-trained Modeling Of A Latent Space Chunyuan Li et al.
- Mapping Natural Language Instructions To Mobile UI Action Sequences Yang Li, Jiacong He, Xin Zhou, Yuan Zhang, Jason Baldridge
- Do Response Selection Models Really Know What's Next? Utterance Manipulation Strategies For Multi-turn Response Selection Taesun Whang et al.
- Making Pre-trained Language Models Better Few-shot Learners Tianyu Gao, Adam Fisch, Danqi Chen
- Masking As An Efficient Alternative To Finetuning For Pretrained Language Models Mengjie Zhao, Tao Lin, Fei Mi, Martin Jaggi, Hinrich Schütze
- Meaningful Answer Generation Of E-commerce Question-answering Shen Gao, Xiuying Chen, Zhaochun Ren, Dongyan Zhao, Rui Yan
- Code Prediction By Feeding Trees To Transformers Seohyun Kim, Jinman Zhao, Yuchi Tian, Satish Chandra
- Fine-tuning Pre-trained Language Model With Weak Supervision: A Contrastive-regularized Self-training Approach Yue Yu et al.
- Ternarybert: Distillation-aware Ultra-low Bit BERT Wei Zhang et al.
- A Knowledge-enhanced Pretraining Model For Commonsense Story Generation Jian Guan, Fei Huang, Zhihao Zhao, Xiaoyan Zhu, Minlie Huang
- Coda: Contrast-enhanced And Diversity-promoting Data Augmentation For Natural Language Understanding Yanru Qu et al.
- Gpt-too: A Language-model-first Approach For Amr-to-text Generation Manuel Mager et al.
- VD-BERT: A Unified Vision And Dialog Transformer With BERT Yue Wang et al.
- KG-BART: Knowledge Graph-augmented BART For Generative Commonsense Reasoning Ye Liu, Yao Wan, Lifang He, Hao Peng, Philip S. Yu
- Better Robustness By More Coverage: Adversarial Training With Mixup Augmentation For Robust Fine-tuning Chenglei Si et al.
- Knowledge-grounded Dialogue Generation With Pre-trained Language Models Xueliang Zhao et al.
- Towards Learning A Generic Agent For Vision-and-language Navigation Via Pre-training Weituo Hao, Chunyuan Li, Xiujun Li, Lawrence Carin, Jianfeng Gao
- Contextualized Perturbation For Textual Adversarial Attack Dianqi Li et al.
- IART: Intent-aware Response Ranking With Transformers In Information-seeking Conversation Systems Liu Yang et al.
- Low-resource Knowledge-grounded Dialogue Generation Xueliang Zhao et al.
- Zero-resource Knowledge-grounded Dialogue Generation Linxiao Li et al.
- Byte Pair Encoding Is Suboptimal For Language Model Pretraining Kaj Bostrom, Greg Durrett
- A Simple But Tough-to-beat Data Augmentation Approach For Natural Language Understanding And Generation Dinghan Shen, Mingzhi Zheng, Yelong Shen, Yanru Qu, Weizhu Chen
- XGPT: Cross-modal Generative Pre-training For Image Captioning Qiaolin Xia et al.
- Alfworld: Aligning Text And Embodied Environments For Interactive Learning Mohit Shridhar et al.
- On The Stability Of Fine-tuning BERT: Misconceptions, Explanations, And Strong Baselines Marius Mosbach, Maksym Andriushchenko, Dietrich Klakow
- M3P: Learning Universal Representations Via Multitask Multilingual Multimodal Pre-training Minheng Ni et al.
- A Simple Language Model For Task-oriented Dialogue Ehsan Hosseini-asl, Bryan Mccann, Chien-sheng Wu, Semih Yavuz, Richard Socher
- GRUEN For Evaluating Linguistic Quality Of Generated Text Wanzheng Zhu, Suma Bhat
- Rapidly Bootstrapping A Question Answering Dataset For COVID-19 Raphael Tang et al.
- PLATO-2: Towards Building An Open-domain Chatbot Via Curriculum Learning Siqi Bao et al.
- Efficient Transformer-based Large Scale Language Representations Using Hardware-friendly Block Structured Pruning Bingbing Li et al.
- PALM: Pre-training An Autoencoding&Autoregressive Language Model For Context-conditioned Generation Bin Bi et al.
- Long Range Arena: A Benchmark For Efficient Transformers Yi Tay et al.
- Tabert: Pretraining For Joint Understanding Of Textual And Tabular Data Pengcheng Yin, Graham Neubig, Wen-tau Yih, Sebastian Riedel
- PONE: A Novel Automatic Evaluation Metric For Open-domain Generative Dialogue Systems Tian Lan, Xian-ling Mao, Wei Wei, Xiaoyan Gao, Heyan Huang
- SOLOIST: Building Task Bots At Scale With Transfer Learning And Machine Teaching Baolin Peng et al.
- Few-shot Natural Language Generation For Task-oriented Dialog Baolin Peng et al.
- DUMA: Reading Comprehension With Transposition Thinking Pengfei Zhu, Hai Zhao, Xiaoguang Li
- Contrastive Distillation On Intermediate Representations For Language Model Compression Siqi Sun et al.
- Robust Conversational AI With Grounded Text Generation Jianfeng Gao et al.
- The Turking Test: Can Language Models Understand Instructions? Avia Efrat, Omer Levy
- Synthesizer: Rethinking Self-attention In Transformer Models Yi Tay et al.
- Bert-hlstms: BERT And Hierarchical Lstms For Visual Storytelling Jing Su, Qingyun Dai, Frank Guerin, Mian Zhou
- Recipes For Safety In Open-domain Chatbots Jing Xu et al.
- Beyond I.I.D.: Three Levels Of Generalization For Question Answering On Knowledge Bases Yu Gu et al.
- Genaug: Data Augmentation For Finetuning Text Generators Steven Y. Feng, Varun Gangal, Dongyeop Kang, Teruko Mitamura, Eduard Hovy
- Chatbot Interaction With Artificial Intelligence: Human Data Augmentation With T5 And Language Transformer Ensemble For Text Classification Jordan J. Bird, Anikó Ekárt, Diego R. Faria
- Recipes For Building An Open-domain Chatbot Stephen Roller et al.
- SPECTER: Document-level Representation Learning Using Citation-informed Transformers Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld
- DIET: Lightweight Language Understanding For Dialogue Systems Tanja Bunk, Daksh Varshneya, Vladimir Vlasov, Alan Nichol
- MEGATRON-CNTRL: Controllable Story Generation With External Knowledge Using Large-scale Language Models Peng Xu et al.
- MART: Memory-augmented Recurrent Transformer For Coherent Video Paragraph Captioning Jie Lei et al.
- From Zero To Hero: On The Limitations Of Zero-shot Cross-lingual Transfer With Multilingual Transformers Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš
- Just Ask: Learning To Answer Questions From Millions Of Narrated Videos Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- Addressing Some Limitations Of Transformers With Feedback Memory Angela Fan, Thibaut Lavril, Edouard Grave, Armand Joulin, Sainbayar Sukhbaatar
- Better Fine-tuning By Reducing Representational Collapse Armen Aghajanyan et al.
- Generative Data Augmentation For Commonsense Reasoning Yiben Yang et al.
- Autoprompt: Eliciting Knowledge From Language Models With Automatically Generated Prompts Taylor Shin, Yasaman Razeghi, Robert L. Logan IV, Eric Wallace, Sameer Singh
- Syntactic Data Augmentation Increases Robustness To Inference Heuristics Junghyun Min, R. Thomas Mccoy, Dipanjan Das, Emily Pitler, Tal Linzen
- Behind The Scene: Revealing The Secrets Of Pre-trained Vision-and-language Models Jize Cao et al.
- BLEURT: Learning Robust Metrics For Text Generation Thibault Sellam, Dipanjan Das, Ankur P. Parikh
- Recall And Learn: Fine-tuning Deep Pretrained Language Models With Less Forgetting Sanyuan Chen et al.
- Coregen: Contextualized Code Representation Learning For Commit Message Generation Lun Yiu Nie et al.
- Auto-captions On GIF: A Large-scale Video-sentence Dataset For Vision-language Pre-training Yingwei Pan et al.
- Fine-tuning BERT For Schema-guided Zero-shot Dialogue State Tracking Yu-ping Ruan, Zhen-hua Ling, Jia-chen Gu, Quan Liu
- GREEK-BERT: The Greeks Visiting Sesame Street John Koutsikakis, Ilias Chalkidis, Prodromos Malakasiotis, Ion Androutsopoulos
- The Effect Of Natural Distribution Shift On Question Answering Models John Miller, Karl Krauth, Benjamin Recht, Ludwig Schmidt
- An Empirical Investigation Of Pre-trained Transformer Language Models For Open-domain Dialogue Generation Piji Li
- Measuring And Reducing Gendered Correlations In Pre-trained Models Kellie Webster et al.
- Prophetnet: Predicting Future N-gram For Sequence-to-sequence Pre-training Weizhen Qi et al.
- Machine Reading Comprehension: The Role Of Contextualized Language Models And Beyond Zhuosheng Zhang, Hai Zhao, Rui Wang
- Encoding Syntactic Knowledge In Transformer Encoder For Intent Detection And Slot Filling Jixuan Wang, Kai Wei, Martin Radfar, Weiwei Zhang, Clement Chung
- Question And Answer Test-train Overlap In Open-domain Question Answering Datasets Patrick Lewis, Pontus Stenetorp, Sebastian Riedel
- Mixkd: Towards Efficient Distillation Of Large-scale Language Models Kevin J Liang et al.
- Look Before You Speak: Visually Contextualized Utterances Paul Hongsuck Seo, Arsha Nagrani, Cordelia Schmid
- Schema-guided Dialogue State Tracking Task At DSTC8 Abhinav Rastogi, Xiaoxue Zang, Srinivas Sunkara, Raghav Gupta, Pranav Khaitan
- Asking Questions The Human Way: Scalable Question-answer Generation From Text Corpus Bang Liu, Haojie Wei, Di Niu, Haolan Chen, Yancheng He
- Dialoglue: A Natural Language Understanding Benchmark For Task-oriented Dialogue Shikib Mehri, Mihail Eric, Dilek Hakkani-tur
- XTREME: A Massively Multilingual Multi-task Benchmark For Evaluating Cross-lingual Generalization Junjie Hu et al.
- Narrative Interpolation For Generating And Understanding Stories Su Wang, Greg Durrett, Katrin Erk
- Leveraging Passage Retrieval With Generative Models For Open Domain Question Answering Gautier Izacard, Edouard Grave
- Dialoguetrm: Exploring The Intra- And Inter-modal Emotional Behaviors In The Conversation Yuzhao Mao et al.
- Pchatbot: A Large-scale Dataset For Personalized Chatbot Hongjin Qian et al.
- Language Models Are Few-shot Learners Tom B. Brown et al.
- Text-to-text Pre-training For Data-to-text Tasks Mihir Kale, Abhinav Rastogi
- Incorporating External Knowledge Through Pre-training For Natural Language To Code Generation Frank F. Xu, Zhengbao Jiang, Pengcheng Yin, Bogdan Vasilescu, Graham Neubig
- X-FACTR: Multilingual Factual Knowledge Retrieval From Pretrained Language Models Zhengbao Jiang, Antonios Anastasopoulos, Jun Araki, Haibo Ding, Graham Neubig
- Text Generation By Learning From Demonstrations Richard Yuanzhe Pang, He He
- Continual Learning For Natural Language Generation In Task-oriented Dialog Systems Fei Mi, Liangwei Chen, Mengjie Zhao, Minlie Huang, Boi Faltings
- Multi-modal Open-domain Dialogue Kurt Shuster, Eric Michael Smith, Da Ju, Jason Weston
- Trading Off Diversity And Quality In Natural Language Generation Hugh Zhang, Daniel Duckworth, Daphne Ippolito, Arvind Neelakantan
- Template Guided Text Generation For Task-oriented Dialogue Mihir Kale, Abhinav Rastogi
- Variational Transformers For Diverse Response Generation Zhaojiang Lin, Genta Indra Winata, Peng Xu, Zihan Liu, Pascale Fung
- Collaborative Storytelling With Large-scale Neural Language Models Eric Nichols, Leo Gao, Randy Gomez
- CERT: Contrastive Self-supervised Learning For Language Understanding Hongchao Fang, Sicheng Wang, Meng Zhou, Jiayuan Ding, Pengtao Xie
- Multilingual Translation With Extensible Multilingual Pretraining And Finetuning Yuqing Tang et al.
- On Learning Universal Representations Across Languages Xiangpeng Wei et al.
- Rethinking Positional Encoding In Language Pre-training Guolin Ke, Di He, Tie-yan Liu
- Residual Energy-based Models For Text Generation Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'aurelio Ranzato
- Data Manipulation: Towards Effective Instance Learning For Neural Dialogue Generation Via Learning To Augment And Reweight Hengyi Cai et al.
- Very Deep Transformers For Neural Machine Translation Xiaodong Liu, Kevin Duh, Liyuan Liu, Jianfeng Gao
- Mixup-transformer: Dynamic Data Augmentation For NLP Tasks Lichao Sun et al.
- Mobilebert: A Compact Task-agnostic BERT For Resource-limited Devices Zhiqing Sun et al.
- Dialogbert: Discourse-aware Response Generation Via Learning To Recover And Rank Utterances Xiaodong Gu, Kang Min Yoo, Jung-woo Ha
- Constructing A Multi-hop QA Dataset For Comprehensive Evaluation Of Reasoning Steps Xanh Ho, Anh-khoa Duong Nguyen, Saku Sugawara, Akiko Aizawa
- Indic-transformers: An Analysis Of Transformer Language Models For Indian Languages Kushal Jain, Adwait Deshpande, Kumar Shridhar, Felix Laumann, Ayushman Dash
- Unilmv2: Pseudo-masked Language Models For Unified Language Model Pre-training Hangbo Bao et al.
- Multilingual Speech Translation With Efficient Finetuning Of Pretrained Models Xian Li et al.
- Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation Ruibo Liu et al.
- Lightseq: A High Performance Inference Library For Transformers Xiaohui Wang, Ying Xiong, Yang Wei, Mingxuan Wang, Lei Li
- BERT Based Multilingual Machine Comprehension In English And Hindi Somil Gupta, Nilesh Khade
- Minilm: Deep Self-attention Distillation For Task-agnostic Compression Of Pre-trained Transformers Wenhui Wang et al.
- Delight: Deep And Light-weight Transformer Sachin Mehta, Marjan Ghazvininejad, Srinivasan Iyer, Luke Zettlemoyer, Hannaneh Hajishirzi
- ELECTRA: Pre-training Text Encoders As Discriminators Rather Than Generators Kevin Clark, Minh-thang Luong, Quoc V. Le, Christopher D. Manning
- Unsupervised Evaluation Of Interactive Dialog With Dialogpt Shikib Mehri, Maxine Eskenazi
- Logical Natural Language Generation From Open-domain Tables Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang
- Mt5: A Massively Multilingual Pre-trained Text-to-text Transformer Linting Xue et al.
- Data Augmentation Using Pre-trained Transformer Models Varun Kumar, Ashutosh Choudhary, Eunah Cho
- The Pile: An 800GB Dataset Of Diverse Text For Language Modeling Leo Gao et al.
- Charbert: Character-aware Pre-trained Language Model Wentao Ma et al.
- Rethinking Embedding Coupling In Pre-trained Language Models Hyung Won Chung, Thibault Févry, Henry Tsai, Melvin Johnson, Sebastian Ruder
- LRC-BERT: Latent-representation Contrastive Knowledge Distillation For Natural Language Understanding Hao Fu et al.
- Increasing Faithfulness In Knowledge-grounded Dialogue With Controllable Features Hannah Rashkin, David Reitter, Gaurav Singh Tomar, Dipanjan Das
- Few-shot Learning With Multilingual Language Models Xi Victoria Lin et al.
- Truthfulqa: Measuring How Models Mimic Human Falsehoods Stephanie Lin, Jacob Hilton, Owain Evans
- The NLP Cookbook: Modern Recipes For Transformer Based Deep Learning Architectures Sushant Singh, Ausif Mahmood
- Evaluating The Robustness Of Retrieval Pipelines With Query Variation Generators Gustavo Penha, Arthur Câmara, Claudia Hauff
- G-transformer For Document-level Machine Translation Guangsheng Bao, Yue Zhang, Zhiyang Teng, Boxing Chen, Weihua Luo
- MAUVE: Measuring The Gap Between Neural Text And Human Text Using Divergence Frontiers Krishna Pillutla et al.
- Language Models Are Few-shot Multilingual Learners Genta Indra Winata et al.
- Improved Text Classification Via Contrastive Adversarial Training Lin Pan, Chung-wei Hang, Avirup Sil, Saloni Potdar
- Improving Stack Overflow Question Title Generation With Copying Enhanced Codebert Model And Bi-modal Information Fengji Zhang et al.
- Bartscore: Evaluating Generated Text As Text Generation Weizhe Yuan, Graham Neubig, Pengfei Liu
- Progressive Transformer-based Generation Of Radiology Reports Farhad Nooralahzadeh, Nicolas Perez Gonzalez, Thomas Frauenfelder, Koji Fujimoto, Michael Krauthammer
- Mention Memory: Incorporating Textual Knowledge Into Transformers Through Entity Mention Attention Michiel De Jong, Yury Zemlyanskiy, Nicholas Fitzgerald, Fei Sha, William Cohen
- Crossing The Conversational Chasm: A Primer On Natural Language Processing For Multilingual Task-oriented Dialogue Systems Evgeniia Razumovskaia et al.
- Text-free Prosody-aware Generative Spoken Language Modeling Eugene Kharitonov et al.
- Evaluating The Robustness Of Neural Language Models To Input Perturbations Milad Moradi, Matthias Samwald
- Multimodal Dialogue Response Generation Qingfeng Sun et al.
- Explaining Documents' Relevance To Search Queries Razieh Rahimi, Youngwoo Kim, Hamed Zamani, James Allan
- MWP-BERT: Numeracy-augmented Pre-training For Math Word Problem Solving Zhenwen Liang et al.
- Bob: BERT Over BERT For Training Persona-based Dialogue Models From Limited Personalized Data Haoyu Song, Yan Wang, Kaiyan Zhang, Wei-nan Zhang, Ting Liu
- Retrieval Augmentation Reduces Hallucination In Conversation Kurt Shuster, Spencer Poff, Moya Chen, Douwe Kiela, Jason Weston
- EVA: An Open-domain Chinese Dialogue System With Large-scale Generative Pre-training Hao Zhou et al.
- On The Safety Of Conversational Models: Taxonomy, Dataset, And Benchmark Hao Sun et al.
- Retrieval Augmented Code Generation And Summarization Md Rizwan Parvez, Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-wei Chang
- Prompt Programming For Large Language Models: Beyond The Few-shot Paradigm Laria Reynolds, Kyle Mcdonell
- Efficient Passage Retrieval With Hashing For Open-domain Question Answering Ikuya Yamada, Akari Asai, Hannaneh Hajishirzi
- Focused Attention Improves Document-grounded Generation Shrimai Prabhumoye, Kazuma Hashimoto, Yingbo Zhou, Alan W Black, Ruslan Salakhutdinov
- Multitask Prompted Training Enables Zero-shot Task Generalization Victor Sanh et al.
- Entailment As Few-shot Learner Sinong Wang, Han Fang, Madian Khabsa, Hanzi Mao, Hao Ma
- Fewshotqa: A Simple Framework For Few-shot Learning Of Question Answering Tasks Using Pre-trained Text-to-text Models Rakesh Chada, Pradeep Natarajan
- Mitigating Political Bias In Language Models Through Reinforced Calibration Ruibo Liu et al.
- Process For Adapting Language Models To Society (PALMS) With Values-targeted Datasets Irene Solaiman, Christy Dennison
- Bitod: A Bilingual Multi-domain Dataset For Task-oriented Dialogue Modeling Zhaojiang Lin et al.
- Revisiting The Primacy Of English In Zero-shot Cross-lingual Transfer Iulia Turc, Kenton Lee, Jacob Eisenstein, Ming-wei Chang, Kristina Toutanova
- Unipelt: A Unified Framework For Parameter-efficient Language Model Tuning Yuning Mao et al.
- All That's 'human' Is Not Gold: Evaluating Human Evaluation Of Generated Text Elizabeth Clark et al.
- Arat5: Text-to-text Transformers For Arabic Language Generation El Moatez Billah Nagoudi, Abdelrahim Elmadany, Muhammad Abdul-mageed
- Controllable Generation From Pre-trained Language Models Via Inverse Prompting Xu Zou et al.
- Text Compression-aided Transformer Encoding Zuchao Li et al.
- Generate, Annotate, And Learn: NLP With Synthetic Text Xuanli He, Islam Nassar, Jamie Kiros, Gholamreza Haffari, Mohammad Norouzi
- End-to-end Training Of Multi-document Reader And Retriever For Open-domain Question Answering Devendra Singh Sachan, Siva Reddy, William Hamilton, Chris Dyer, Dani Yogatama
- Improving And Simplifying Pattern Exploiting Training Derek Tam, Rakesh R Menon, Mohit Bansal, Shashank Srivastava, Colin Raffel
- UC2: Universal Cross-lingual Cross-modal Vision-and-language Pre-training Mingyang Zhou et al.
- Luna: Linear Unified Nested Attention Xuezhe Ma et al.
- A Plug-and-play Method For Controlled Text Generation Damian Pascual, Beni Egressy, Clara Meister, Ryan Cotterell, Roger Wattenhofer
- Automated Quality Assessment Of Cognitive Behavioral Therapy Sessions Through Highly Contextualized Language Representations Nikolaos Flemotomos et al.
- True Few-shot Learning With Prompts -- A Real-world Perspective Timo Schick, Hinrich Schütze
- The Stability-efficiency Dilemma: Investigating Sequence Length Warmup For Training GPT Models Conglong Li, Minjia Zhang, Yuxiong He
- MAGMA -- Multimodal Augmentation Of Generative Models Through Adapter-based Finetuning Constantin Eichenberg, Sidney Black, Samuel Weinbach, Letitia Parcalabescu, Anette Frank
- Language Model Evaluation Beyond Perplexity Clara Meister, Ryan Cotterell
- Meta-learning Via Language Model In-context Tuning Yanda Chen, Ruiqi Zhong, Sheng Zha, George Karypis, He He
- One Teacher Is Enough? Pre-trained Language Model Distillation From Multiple Teachers Chuhan Wu, Fangzhao Wu, Yongfeng Huang
- Can Generative Pre-trained Language Models Serve As Knowledge Bases For Closed-book QA? Cunxiang Wang, Pai Liu, Yue Zhang
- Multimodal Transformer With Variable-length Memory For Vision-and-language Navigation Chuang Lin et al.
- Dialogue State Tracking With A Language Model Using Schema-driven Prompting Chia-hsuan Lee, Hao Cheng, Mari Ostendorf
- Larger-scale Transformers For Multilingual Masked Language Modeling Naman Goyal, Jingfei Du, Myle Ott, Giri Anantharaman, Alexis Conneau
- Prompting Visual-language Models For Efficient Video Understanding Chen Ju, Tengda Han, Kunhao Zheng, Ya Zhang, Weidi Xie
- RAFT: A Real-world Few-shot Text Classification Benchmark Neel Alex et al.
- Simvlm: Simple Visual Language Model Pretraining With Weak Supervision Zirui Wang et al.
- Is GPT-3 Text Indistinguishable From Human Text? Scarecrow: A Framework For Scrutinizing Machine Text Yao Dou, Maxwell Forbes, Rik Koncel-kedziorski, Noah A. Smith, Yejin Choi
- See, Hear, Read: Leveraging Multimodality With Guided Attention For Abstractive Text Summarization Yash Kumar Atri, Shraman Pramanick, Vikram Goyal, Tanmoy Chakraborty
- Adversarial GLUE: A Multi-task Benchmark For Robustness Evaluation Of Language Models Boxin Wang et al.
- Dynaboard: An Evaluation-as-a-service Platform For Holistic Next-generation Benchmarking Zhiyi Ma et al.
- UNICORN On RAINBOW: A Universal Commonsense Reasoning Model On A New Multitask Benchmark Nicholas Lourie, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi
- Understanding And Overcoming The Challenges Of Efficient Transformer Quantization Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort
- Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners Ningyu Zhang et al.
- Prompt-learning For Fine-grained Entity Typing Ning Ding et al.
- NSP-BERT: A Prompt-based Few-shot Learner Through An Original Pre-training Task--Next Sentence Prediction Yi Sun, Yu Zheng, Chao Hao, Hangping Qiu
- Neural Path Hunter: Reducing Hallucination In Dialogue Systems Via Path Grounding Nouha Dziri, Andrea Madotto, Osmar Zaiane, Avishek Joey Bose
- MarIA: Spanish Language Models Asier Gutiérrez-Fandiño et al.
- DeBERTaV3: Improving DeBERTa Using ELECTRA-style Pre-training With Gradient-disentangled Embedding Sharing Pengcheng He, Jianfeng Gao, Weizhu Chen
- Towards Facilitating Empathic Conversations In Online Mental Health Support: A Reinforcement Learning Approach Ashish Sharma, Inna W. Lin, Adam S. Miner, David C. Atkins, Tim Althoff
- HTLM: Hyper-text Pre-training And Prompting Of Language Models Armen Aghajanyan et al.
- Multilingual LAMA: Investigating Knowledge In Multilingual Pretrained Language Models Nora Kassner, Philipp Dufter, Hinrich Schütze
- Hierarchical Task Learning From Language Instructions With Unified Transformers And Self-monitoring Yichi Zhang, Joyce Chai
- Vl-adapter: Parameter-efficient Transfer Learning For Vision-and-language Tasks Yi-Lin Sung, Jaemin Cho, Mohit Bansal
- Human Parity On CommonsenseQA: Augmenting Self-attention With External Attention Yichong Xu et al.
- Sustainable Modular Debiasing Of Language Models Anne Lauscher, Tobias Lüken, Goran Glavaš
- Predicting The Performance Of Multilingual NLP Models Anirudh Srinivasan et al.
- ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Yu Sun et al.
- Wordcraft: A Human-ai Collaborative Editor For Story Writing Andy Coenen, Luke Davis, Daphne Ippolito, Emily Reif, Ann Yuan
- Mind The Gap: Assessing Temporal Generalization In Neural Language Models Angeliki Lazaridou et al.
- Few-shot Bot: Prompt-based Learning For Dialogue Systems Andrea Madotto, Zhaojiang Lin, Genta Indra Winata, Pascale Fung
- Revisiting Self-training For Few-shot Learning Of Language Model Yiming Chen et al.
- \(Q^{2}\): Evaluating Factual Consistency In Knowledge-grounded Dialogues Via Question Generation And Question Answering Or Honovich et al.
- A General Language Assistant As A Laboratory For Alignment Amanda Askell et al.
- Few-shot Question Answering By Pretraining Span Selection Ori Ram, Yuval Kirstain, Jonathan Berant, Amir Globerson, Omer Levy
- An Empirical Study Of GPT-3 For Few-shot Knowledge-based VQA Zhengyuan Yang et al.
- Task-oriented Dialogue System As Natural Language Generation Weizhi Wang et al.
- Webqa: Multihop And Multimodal QA Yingshan Chang et al.
- Episodic Transformer For Vision-and-language Navigation Alexander Pashevich, Cordelia Schmid, Chen Sun
- Dexperts: Decoding-time Controlled Text Generation With Experts And Anti-experts Alisa Liu et al.
- TaCL: Improving BERT Pre-training With Token-aware Contrastive Learning Yixuan Su et al.
- Multi-task Pre-training For Plug-and-play Task-oriented Dialogue System Yixuan Su et al.
- On Explaining Your Explanations Of BERT: An Empirical Study With Sequence Classification Zhengxuan Wu, Desmond C. Ong
- MT6: Multilingual Pretrained Text-to-text Transformer With Translation Pairs Zewen Chi et al.
- One Question Answering Model For Many Languages With Cross-lingual Dense Passage Retrieval Akari Asai, Xinyan Yu, Jungo Kasai, Hannaneh Hajishirzi
- Spot: Better Frozen Model Adaptation Through Soft Prompt Transfer Tu Vu, Brian Lester, Noah Constant, Rami Al-Rfou, Daniel Cer
- Quiz-style Question Generation For News Stories Adam D. Lelkes, Vinh Q. Tran, Cong Yu
- TURINGBENCH: A Benchmark Environment For Turing Test In The Age Of Neural Text Generation Adaku Uchendu, Zeyu Ma, Thai Le, Rui Zhang, Dongwon Lee
- Embodied BERT: A Transformer Model For Embodied, Language-guided Visual Task Completion Alessandro Suglia, Qiaozi Gao, Jesse Thomason, Govind Thattai, Gaurav Sukhatme
- VisQA: X-raying Vision And Language Reasoning In Transformers Theo Jaunet et al.
- FLEX: Unifying Evaluation For Few-shot NLP Jonathan Bragg, Arman Cohan, Kyle Lo, Iz Beltagy
- CANINE: Pre-training An Efficient Tokenization-free Encoder For Language Representation Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting
- Challenges In Detoxifying Language Models Johannes Welbl et al.
- Towards Continual Knowledge Learning Of Language Models Joel Jang et al.
- A Simple Recipe For Multilingual Grammatical Error Correction Sascha Rothe, Jonathan Mallinson, Eric Malmi, Sebastian Krause, Aliaksei Severyn
- Rome Was Built In 1776: A Case Study On Factual Correctness In Knowledge-grounded Response Generation Sashank Santhanam et al.
- Beyond Goldfish Memory: Long-term Open-domain Conversation Jing Xu, Arthur Szlam, Jason Weston
- SIMMC 2.0: A Task-oriented Dialog Dataset For Immersive Multimodal Conversations Satwik Kottur, Seungwhan Moon, Alborz Geramifard, Babak Damavandi
- XTREME-R: Towards More Challenging And Nuanced Multilingual Evaluation Sebastian Ruder et al.
- Sentence-T5: Scalable Sentence Encoders From Pre-trained Text-to-text Models Jianmo Ni et al.
- E-vil: A Dataset And Benchmark For Natural Language Explanations In Vision-language Tasks Maxime Kayser et al.
- Improving Question Answering Model Robustness With Synthetic Adversarial Data Generation Max Bartolo et al.
- HiddenCut: Simple Data Augmentation For Natural Language Understanding With Better Generalization Jiaao Chen, Dinghan Shen, Weizhu Chen, Diyi Yang
- CodeXGLUE: A Machine Learning Benchmark Dataset For Code Understanding And Generation Shuai Lu et al.
- FLAT: An Optimized Dataflow For Mitigating Attention Bottlenecks Sheng-Chun Kao, Suvinay Subramanian, Gaurav Agrawal, Amir Yazdanbakhsh, Tushar Krishna
- Multimodal Few-shot Learning With Frozen Language Models Maria Tsimpoukelli et al.
- What Makes Good In-context Examples For GPT-3? Jiachang Liu et al.
- LightningDOT: Pre-training Visual-semantic Embeddings For Real-time Image-text Retrieval Siqi Sun et al.
- Code Structure Guided Transformer For Source Code Summarization Shuzheng Gao et al.
- Recursively Summarizing Books With Human Feedback Jeff Wu et al.
- Evaluating Large Language Models Trained On Code Mark Chen et al.
- Compacter: Efficient Low-rank Hypercomplex Adapter Layers Rabeeh Karimi Mahabadi, James Henderson, Sebastian Ruder
- UniTAB: Unifying Text And Box Outputs For Grounded Vision-language Modeling Zhengyuan Yang et al.
- GALAXY: A Generative Pre-trained Model For Task-oriented Dialog With Semi-supervised Learning And Explicit Policy Injection Wanwei He et al.
- Variational Information Bottleneck For Effective Low-resource Fine-tuning Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- Unifying Vision-and-language Tasks Via Text Generation Jaemin Cho, Jie Lei, Hao Tan, Mohit Bansal
- Program Synthesis With Large Language Models Jacob Austin et al.
- Generated Knowledge Prompting For Commonsense Reasoning Jiacheng Liu et al.
- Diagnosing Vision-and-language Navigation: What Really Matters Wanrong Zhu et al.
- How Much Do Language Models Copy From Their Training Data? Evaluating Linguistic Novelty In Text Generation Using RAVEN R. Thomas McCoy, Paul Smolensky, Tal Linzen, Jianfeng Gao, Asli Celikyilmaz
- WebGPT: Browser-assisted Question-answering With Human Feedback Reiichiro Nakano et al.
- RedditBias: A Real-world Resource For Bias Evaluation And Debiasing Of Conversational Language Models Soumya Barikeri, Anne Lauscher, Ivan Vulić, Goran Glavaš
- Hurdles To Progress In Long-form Question Answering Kalpesh Krishna, Aurko Roy, Mohit Iyyer
- WARP: Word-level Adversarial Reprogramming Karen Hambardzumyan, Hrant Khachatrian, Jonathan May
- Few-shot Knowledge Graph-to-text Generation With Pretrained Language Models Junyi Li et al.
- IndoNLG: Benchmark And Resources For Evaluating Indonesian Natural Language Generation Samuel Cahyawijaya et al.
- RoBERTuito: A Pre-trained Language Model For Social Media Text In Spanish Juan Manuel Pérez, Damián A. Furman, Laura Alonso Alemany, Franco Luque
- AMMUS : A Survey Of Transformer-based Pretrained Models In Natural Language Processing Katikapalli Subramanyam Kalyan, Ajit Rajasekharan, Sivanesan Sangeetha
- Raise A Child In Large Language Model: Towards Effective And Generalizable Fine-tuning Runxin Xu et al.
- Diverse Demonstrations Improve In-context Compositional Generalization Itay Levy, Ben Bogin, Jonathan Berant
- RetroMAE: Pre-training Retrieval-oriented Language Models Via Masked Auto-encoder Shitao Xiao, Zheng Liu, Yingxia Shao, Zhao Cao
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- Evaluating Mixed-initiative Conversational Search Systems Via User Simulation Ivan Sekulić, Mohammad Aliannejadi, Fabio Crestani
- Explanations From Large Language Models Make Small Reasoners Better Shiyang Li et al.
- WebShop: Towards Scalable Real-world Web Interaction With Grounded Language Agents Shunyu Yao, Howard Chen, John Yang, Karthik Narasimhan
- Revisiting Neural Scaling Laws In Language And Vision Ibrahim Alabdulmohsin, Behnam Neyshabur, Xiaohua Zhai
- Scaling Instruction-finetuned Language Models Hyung Won Chung et al.
- ReACC: A Retrieval-augmented Code Completion Framework Shuai Lu et al.
- CodeRL: Mastering Code Generation Through Pretrained Models And Deep Reinforcement Learning Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Silvio Savarese, Steven C. H. Hoi
- Interactive Code Generation Via Test-driven User-intent Formalization Shuvendu K. Lahiri et al.
- Recommendation As Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5) Shijie Geng, Shuchang Liu, Zuohui Fu, Yingqiang Ge, Yongfeng Zhang
- CLIP-TD: CLIP Targeted Distillation For Vision-language Tasks Zhecan Wang et al.
- BioBART: Pretraining And Evaluation Of A Biomedical Generative Language Model Hongyi Yuan et al.
- One Embedder, Any Task: Instruction-finetuned Text Embeddings Hongjin Su et al.
- On The Paradox Of Learning To Reason From Data Honghua Zhang, Liunian Harold Li, Tao Meng, Kai-Wei Chang, Guy Van Den Broeck
- Language Models As Zero-shot Planners: Extracting Actionable Knowledge For Embodied Agents Wenlong Huang, Pieter Abbeel, Deepak Pathak, Igor Mordatch
- EVA2.0: Investigating Open-domain Chinese Dialogue Systems With Large-scale Pre-training Yuxian Gu et al.
- Few-shot Parameter-efficient Fine-tuning Is Better And Cheaper Than In-context Learning Haokun Liu et al.
- CogVideo: Large-scale Pretraining For Text-to-video Generation Via Transformers Wenyi Hong, Ming Ding, Wendi Zheng, Xinghan Liu, Jie Tang
- Contrastive Learning With Bidirectional Transformers For Sequential Recommendation Hanwen Du et al.
- A Survey Of Controllable Text Generation Using Transformer-based Pre-trained Language Models Hanqing Zhang, Haolin Song, Shaoyu Li, Ming Zhou, Dawei Song
- VL-BEiT: Generative Vision-language Pretraining Hangbo Bao, Wenhui Wang, Li Dong, Furu Wei
- Revisiting Parameter-efficient Tuning: Are We Really There Yet? Guanzheng Chen, Fangyu Liu, Zaiqiao Meng, Shangsong Liang
- Evaluating And Inducing Personality In Pre-trained Language Models Guangyuan Jiang et al.
- DialFRED: Dialogue-enabled Agents For Embodied Instruction Following Xiaofeng Gao et al.
- Atlas: Few-shot Learning With Retrieval Augmented Language Models Gautier Izacard et al.
- Contrastive Decoding: Open-ended Text Generation As Optimization Xiang Lisa Li et al.
- Language Models Are Multilingual Chain-of-thought Reasoners Freda Shi et al.
- PlanBench: An Extensible Benchmark For Evaluating Large Language Models On Planning And Reasoning About Change Karthik Valmeekam, Matthew Marquez, Alberto Olmo, Sarath Sreedharan, Subbarao Kambhampati
- Large Language Models Encode Clinical Knowledge Karan Singhal et al.
- Minicons: Enabling Flexible Behavioral And Representational Analyses Of Transformer Language Models Kanishka Misra
- VLC-BERT: Visual Question Answering With Contextualized Commonsense Knowledge Sahithya Ravi, Aditya Chinchure, Leonid Sigal, Renjie Liao, Vered Shwartz
- Training Compute-optimal Large Language Models Jordan Hoffmann et al.
- OPT-IML: Scaling Language Model Instruction Meta Learning Through The Lens Of Generalization Srinivasan Iyer et al.
- Towards Trustworthy Autograding Of Short, Multi-lingual, Multi-type Answers Johannes Schneider, Robin Richner, Micha Riser
- Vision-language Pre-training With Triple Contrastive Learning Jinyu Yang et al.
- News Summarization And Evaluation In The Era Of GPT-3 Tanya Goyal, Junyi Jessy Li, Greg Durrett
- Language Models (mostly) Know What They Know Saurav Kadavath et al.
- Instruction Tuning For Few-shot Aspect-based Sentiment Analysis Siddharth Varia et al.
- RASAT: Integrating Relational Structures Into Pretrained Seq2seq Model For Text-to-sql Jiexing Qi et al.
- Towards Reasoning In Large Language Models: A Survey Jie Huang, Kevin Chen-Chuan Chang
- Unified-IO: A Unified Model For Vision, Language, And Multi-modal Tasks Jiasen Lu, Christopher Clark, Rowan Zellers, Roozbeh Mottaghi, Aniruddha Kembhavi
- LiLT: A Simple Yet Effective Language-independent Layout Transformer For Structured Document Understanding Jiapeng Wang, Lianwen Jin, Kai Ding
- GIT: A Generative Image-to-text Transformer For Vision And Language Jianfeng Wang et al.
- DiffuSeq: Sequence To Sequence Text Generation With Diffusion Models Shansan Gong, Mukai Li, Jiangtao Feng, Zhiyong Wu, Lingpeng Kong
- Ask Me Anything: A Simple Strategy For Prompting Language Models Simran Arora et al.
- Scaling Autoregressive Models For Content-rich Text-to-image Generation Jiahui Yu et al.
- ZeroGen: Efficient Zero-shot Learning Via Dataset Generation Jiacheng Ye et al.
- Gtrans: Grouping And Fusing Transformer Layers For Neural Machine Translation Jian Yang et al.
- Adapting Pre-trained Language Models To African Languages Via Multilingual Adaptive Fine-tuning Jesujoba O. Alabi, David Ifeoluwa Adelani, Marius Mosbach, Dietrich Klakow
- ReAct: Synergizing Reasoning And Acting In Language Models Shunyu Yao et al.
- Revisiting The "video" In Video-language Understanding Shyamal Buch et al.
- SPACE-3: Unified Dialog Model Pre-training For Task-oriented Dialog Understanding And Generation Wanwei He et al.
- Flamingo: A Visual Language Model For Few-shot Learning Jean-Baptiste Alayrac et al.
- Chain-of-thought Prompting Elicits Reasoning In Large Language Models Jason Wei et al.
- Benchmarking Large Language Models For Automated Verilog RTL Code Generation Shailja Thakur et al.
- Maieutic Prompting: Logically Consistent Reasoning With Recursive Explanations Jaehun Jung et al.
- DALL-Eval: Probing The Reasoning Skills And Social Biases Of Text-to-image Generation Models Jaemin Cho, Abhay Zala, Mohit Bansal
- Teaching Language Models To Support Answers With Verified Quotes Jacob Menick et al.
- GPT-NeoX-20B: An Open-source Autoregressive Language Model Sid Black et al.
- Using DeepSpeed And Megatron To Train Megatron-Turing NLG 530B, A Large-scale Generative Language Model Shaden Smith et al.
- PAL: Program-aided Language Models Luyu Gao et al.
- What Language Model Architecture And Pretraining Objective Work Best For Zero-shot Generalization? Thomas Wang et al.
- Prompting Is Programming: A Query Language For Large Language Models Luca Beurer-Kellner, Marc Fischer, Martin Vechev
- Training Language Models To Follow Instructions With Human Feedback Long Ouyang et al.
- ViT5: Pretrained Text-to-text Transformer For Vietnamese Language Generation Long Phan, Hieu Tran, Hieu Nguyen, Trieu H. Trinh
- In-context Examples Selection For Machine Translation Sweta Agrawal, Chunting Zhou, Mike Lewis, Luke Zettlemoyer, Marjan Ghazvininejad
- Real Or Fake Text?: Investigating Human Ability To Detect Boundaries Between Human-written And Machine-generated Text Liam Dugan, Daphne Ippolito, Arun Kirubarajan, Sherry Shi, Chris Callison-Burch
- The Goldilocks Of Pragmatic Understanding: Fine-tuning Strategy Matters For Implicature Resolution By LLMs Laura Ruis et al.
- BlenderBot 3: A Deployed Conversational Agent That Continually Learns To Responsibly Engage Kurt Shuster et al.
- Is Reinforcement Learning (not) For Natural Language Processing: Benchmarks, Baselines, And Building Blocks For Natural Language Policy Optimization Rajkumar Ramamurthy et al.
- Efficient Long-text Understanding With Short-text Models Maor Ivgi, Uri Shaham, Jonathan Berant
- Reproducible Scaling Laws For Contrastive Language-image Learning Mehdi Cherti et al.
- Can Large Language Models Reason About Medical Questions? Valentin Liévin, Christoffer Egeberg Hother, Andreas Geert Motzfeldt, Ole Winther
- Towards Using Few-shot Prompt Learning For Automating Model Completion Meriem Ben Chaaben, Lola Burgueño, Houari Sahraoui
- When And Why Vision-language Models Behave Like Bags-of-words, And What To Do About It? Mert Yuksekgonul, Federico Bianchi, Pratyusha Kalluri, Dan Jurafsky, James Zou
- Self-conditioned Embedding Diffusion For Text Generation Robin Strudel et al.
- A Systematic Evaluation Of Large Language Models Of Code Frank F. Xu, Uri Alon, Graham Neubig, Vincent J. Hellendoorn
- NLX-GPT: A Model For Natural Language Explanations In Vision And Vision-language Tasks Fawaz Sammani, Tanmoy Mukherjee, Nikos Deligiannis
- DePlot: One-shot Visual Language Reasoning By Plot-to-table Translation Fangyu Liu et al.
- ERNIE-Search: Bridging Cross-encoder With Dual-encoder Via Self On-the-fly Distillation For Dense Passage Retrieval Yuxiang Lu et al.
- MatCha: Enhancing Visual Language Pretraining With Math Reasoning And Chart Derendering Fangyu Liu et al.
- GreaseLM: Graph Reasoning Enhanced Language Models For Question Answering Xikun Zhang et al.
- CodeGen: An Open Large Language Model For Code With Multi-turn Program Synthesis Erik Nijkamp et al.
- Memory-based Model Editing At Scale Eric Mitchell, Charles Lin, Antoine Bosselut, Christopher D. Manning, Chelsea Finn
- VL-InterpreT: An Interactive Visualization Tool For Interpreting Vision-language Transformers Estelle Aflalo et al.
- IGLUE: A Benchmark For Transfer Learning Across Modalities, Tasks, And Languages Emanuele Bugliarello et al.
- The Optimal BERT Surgeon: Scalable And Accurate Second-order Pruning For Large Language Models Eldar Kurtic et al.
- CREPE: Can Vision-language Foundation Models Reason Compositionally? Zixian Ma et al.
- A Generative Language Model For Few-shot Aspect-based Sentiment Analysis Ehsan Hosseini-Asl, Wenhao Liu, Caiming Xiong
- LAVIS: A Library For Language-vision Intelligence Dongxu Li et al.
- HyperPrompt: Prompt-based Task-conditioning Of Transformers Yun He et al.
- Self-adaptive In-context Learning: An Information Compression Perspective For In-context Example Selection And Ordering Zhiyong Wu, Yaoxiang Wang, Jiacheng Ye, Lingpeng Kong
- AltCLIP: Altering The Language Encoder In CLIP For Extended Language Capabilities Zhongzhi Chen et al.
- VIMA: General Robot Manipulation With Multimodal Prompts Yunfan Jiang et al.
- The Stack: 3 TB Of Permissively Licensed Source Code Denis Kocetkov et al.
- Least-to-most Prompting Enables Complex Reasoning In Large Language Models Denny Zhou et al.
- Learning Vector-quantized Item Representation For Transferable Sequential Recommenders Yupeng Hou, Zhankui He, Julian McAuley, Wayne Xin Zhao
- Future Transformer For Long-term Action Anticipation Dayoung Gong, Joonseok Lee, Manjin Kim, Seong Jong Ha, Minsu Cho
- FactPEGASUS: Factuality-aware Pre-training And Fine-tuning For Abstractive Summarization David Wan, Mohit Bansal
- Prompting PaLM For Translation: Assessing Strategies And Performance David Vilar et al.
- AdaPrompt: Adaptive Model Training For Prompt-based NLP Yulong Chen et al.
- Large Language Models Meet NL2Code: A Survey Daoguang Zan et al.
- CERT: Continual Pre-training On Sketches For Library-oriented Code Generation Daoguang Zan et al.
- Hungry Hungry Hippos: Towards Language Modeling With State Space Models Daniel Y. Fu et al.
- InCoder: A Generative Model For Code Infilling And Synthesis Daniel Fried et al.
- Democratizing Contrastive Language-image Pre-training: A CLIP Benchmark Of Data, Model, And Supervision Yufeng Cui, Lichen Zhao, Feng Liang, Yangguang Li, Jing Shao
- Unified Vision And Language Prompt Learning Yuhang Zang, Wei Li, Kaiyang Zhou, Chen Huang, Chen Change Loy
- Learning Video Representations From Large Language Models Yue Zhao, Ishan Misra, Philipp Krähenbühl, Rohit Girdhar
- AugESC: Dialogue Augmentation With Large Language Models For Emotional Support Conversation Chujie Zheng, Sahand Sabour, Jiaxin Wen, Zheng Zhang, Minlie Huang
- Competition-level Code Generation With Alphacode Yujia Li et al.
- PromDA: Prompt-based Data Augmentation For Low-resource NLU Tasks Yufei Wang et al.
- Prompt For Extraction? PAIE: Prompting Argument Interaction For Event Argument Extraction Yubo Ma et al.
- Cont: Contrastive Neural Text Generation Chenxin An et al.
- Linearly Mapping From Image To Text Space Jack Merullo, Louis Castricato, Carsten Eickhoff, Ellie Pavlick
- NoisyTune: A Little Noise Can Help You Finetune Pretrained Language Models Better Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang, Xing Xie
- Prompting GPT-3 To Be Reliable Chenglei Si et al.
- A Unified End-to-end Retriever-reader Framework For Knowledge-based VQA Yangyang Guo et al.
- A Unified Multi-task Learning Framework For Multi-goal Conversational Recommender Systems Yang Deng et al.
- Self-consistency Improves Chain Of Thought Reasoning In Language Models Xuezhi Wang et al.
- Texts As Images In Prompt Tuning For Multi-label Image Recognition Zixian Guo et al.
- No More Fine-tuning? An Experimental Evaluation Of Prompt Tuning In Code Intelligence Chaozheng Wang et al.
- Complexity-based Prompting For Multi-step Reasoning Yao Fu, Hao Peng, Ashish Sabharwal, Peter Clark, Tushar Khot
- A Survey On Model Compression And Acceleration For Pretrained Language Models Canwen Xu, Julian McAuley
- Optimizing Prompts For Text-to-image Generation Yaru Hao, Zewen Chi, Li Dong, Furu Wei
- Language Models Are General-purpose Interfaces Yaru Hao et al.
- Exploring The Limits Of Domain-adaptive Training For Detoxifying Large-scale Language Models Boxin Wang et al.
- Survey Of Hallucination In Natural Language Generation Ziwei Ji et al.
- Multimodal Knowledge Alignment With Reinforcement Learning Youngjae Yu et al.
- Super-NaturalInstructions: Generalization Via Declarative Instructions On 1600+ NLP Tasks Yizhong Wang et al.
- Impact Of Pretraining Term Frequencies On Few-shot Reasoning Yasaman Razeghi, Robert L. Logan IV, Matt Gardner, Sameer Singh
- Analogy Generation By Prompting Large Language Models: A Case Study Of InstructGPT Bhavya Bhavya, Jinjun Xiong, Chengxiang Zhai
- BLOOM: A 176b-parameter Open-access Multilingual Language Model BigScience Workshop et al.
- Memorizing Transformers Yuhuai Wu, Markus N. Rabe, Delesley Hutchins, Christian Szegedy
- Attributed Question Answering: Evaluation And Modeling For Attributed Large Language Models Bernd Bohnet et al.
- Multi-lingual Evaluation Of Code Generation Models Ben Athiwaratkun et al.
- CodeT: Code Generation With Generated Tests Bei Chen et al.
- GODEL: Large-scale Pre-training For Goal-directed Dialog Baolin Peng et al.
- Retrieval Augmentation Of Large Language Models For Lay Language Generation Yue Guo, Wei Qiu, Gondy Leroy, Sheng Wang, Trevor Cohen
- Automatic Chain Of Thought Prompting In Large Language Models Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola
- Generative Language Models For Paragraph-level Question Generation Asahi Ushio, Fernando Alva-Manchego, Jose Camacho-Collados
- T-NER: An All-round Python Library For Transformer-based Named Entity Recognition Asahi Ushio, Jose Camacho-Collados
- Dialog Inpainting: Turning Documents Into Dialogs Zhuyun Dai et al.
- GLM-130B: An Open Bilingual Pre-trained Model Aohan Zeng et al.
- Selection-inference: Exploiting Large Language Models For Interpretable Logical Reasoning Antonia Creswell, Murray Shanahan, Irina Higgins
- Making Large Language Models Better Reasoners With Step-aware Verifier Yifei Li et al.
- TIARA: Multi-grained Retrieval For Robust Question Answering Over Large Knowledge Bases Yiheng Shu et al.
- The AI Teacher Test: Measuring The Pedagogical Ability Of Blender And GPT-3 In Educational Dialogues Anaïs Tack, Chris Piech
- Generating Training Data With Language Models: Towards Zero-shot Language Understanding Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han
- Language Model Compression With Weighted Low-rank Factorization Yen-Chang Hsu et al.
- Contrastive Search Is What You Need For Neural Text Generation Yixuan Su, Nigel Collier
- CommonsenseQA 2.0: Exposing The Limits Of AI Through Gamification Alon Talmor et al.
- WANLI: Worker And AI Collaboration For Natural Language Inference Dataset Creation Alisa Liu, Swabha Swayamdipta, Noah A. Smith, Yejin Choi
- Position-guided Text Prompt For Vision-language Pre-training Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan
- Personalized Prompt For Sequential Recommendation Yiqing Wu et al.
- DualPrompt: Complementary Prompting For Rehearsal-free Continual Learning Zifeng Wang et al.
- LASP: Text-to-text Optimization For Language-aware Soft Prompting Of Vision & Language Models Adrian Bulat, Georgios Tzimiropoulos
- Language Models Are Greedy Reasoners: A Systematic Formal Analysis Of Chain-of-thought Abulhair Saparov, He He
- Scaling Up Models And Data With \(\texttt{t5x}\) And \(\texttt{seqio}\) Adam Roberts et al.
- Beyond The Imitation Game: Quantifying And Extrapolating The Capabilities Of Language Models Aarohi Shammie Srivastava et al.
- PaLM: Scaling Language Modeling With Pathways Aakanksha Chowdhery et al.
- Solving Quantitative Reasoning Problems With Language Models Aitor Lewkowycz et al.
- Learn To Explain: Multimodal Reasoning Via Thought Chains For Science Question Answering Pan Lu et al.
- Unnatural Instructions: Tuning Language Models With (almost) No Human Labor Or Honovich, Thomas Scialom, Omer Levy, Timo Schick
- On Second Thought, Let's Not Think Step By Step! Bias And Toxicity In Zero-shot Reasoning Omar Shaikh, Hongxin Zhang, William Held, Michael Bernstein, Diyi Yang
- ROSCOE: A Suite Of Metrics For Scoring Step-by-step Reasoning Olga Golovneva et al.
- What Matters In Language Conditioned Robotic Imitation Learning Over Unstructured Data Oier Mees, Lukas Hermann, Wolfram Burgard
- On The Origin Of Hallucinations In Conversational Models: Is It The Datasets Or The Models? Nouha Dziri, Sivan Milton, Mo Yu, Osmar Zaiane, Siva Reddy
- Grounding Language With Visual Affordances Over Unstructured Data Oier Mees, Jessica Borja-diaz, Wolfram Burgard
- FaithDial: A Faithful Benchmark For Information-seeking Dialogue Nouha Dziri et al.
- "This Is My Unicorn, Fluffy": Personalizing Frozen Vision-language Representations Niv Cohen, Rinon Gal, Eli A. Meirom, Gal Chechik, Yuval Atzmon
- No Language Left Behind: Scaling Human-centered Machine Translation NLLB Team et al.
- Demonstrate-search-predict: Composing Retrieval And Language Models For Knowledge-intensive NLP Omar Khattab et al.
- Learning To Compose Soft Prompts For Compositional Zero-shot Learning Nihal V. Nayak, Peilin Yu, Stephen H. Bach
- UnifiedSKG: Unifying And Multi-tasking Structured Knowledge Grounding With Text-to-text Language Models Tianbao Xie et al.
- SGPT: GPT Sentence Embeddings For Semantic Search Niklas Muennighoff
- VL-CheckList: Evaluating Pre-trained Vision-language Models With Objects, Attributes And Relations Tiancheng Zhao et al.
- Contrastive Learning Reduces Hallucination In Conversations Weiwei Sun et al.
- Clinical Prompt Learning With Frozen Language Models Niall Taylor, Yi Zhang, Dan Joyce, Alejo Nevado-Holgado, Andrey Kormilitzin
- Fine-tuned Language Models Are Continual Learners Thomas Scialom, Tuhin Chakrabarty, Smaranda Muresan
- ToxiGen: A Large-scale Machine-generated Dataset For Adversarial And Implicit Hate Speech Detection Thomas Hartvigsen et al.
- Efficient Training Of Language Models To Fill In The Middle Mohammad Bavarian et al.
- DyLoRA: Parameter Efficient Tuning Of Pre-trained Models Using Dynamic Search-free Low-rank Adaptation Mojtaba Valipour, Mehdi Rezagholizadeh, Ivan Kobyzev, Ali Ghodsi
- Challenging BIG-Bench Tasks And Whether Chain-of-thought Can Solve Them Mirac Suzgun et al.
- Towards A Unified Multi-dimensional Evaluator For Text Generation Ming Zhong et al.
- Evaluating Human-language Model Interaction Mina Lee et al.
- An Empirical Study Of End-to-end Video-language Transformers With Masked Visual Modeling Tsu-Jui Fu et al.
- Training And Evaluating A Jupyter Notebook Data Science Assistant Shubham Chandel, Colin B. Clement, Guillermo Serrato, Neel Sundaresan
- GPT Takes The Bar Exam Michael Bommarito II, Daniel Martin Katz
- Re2G: Retrieve, Rerank, Generate Michael Glass et al.
- Large Language Models Are Zero-shot Reasoners Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa
- InstructDial: Improving Zero And Few-shot Generalization In Dialogue Through Instruction Tuning Prakhar Gupta et al.
- Co-writing Screenplays And Theatre Scripts With Language Models: An Evaluation By Industry Professionals Piotr Mirowski, Kory W. Mathewson, Jaylen Pittman, Richard Evans
- Conversational Question Answering On Heterogeneous Sources Philipp Christmann, Rishiraj Saha Roy, Gerhard Weikum
- Holistic Evaluation Of Language Models Percy Liang et al.
- PINTO: Faithful Language Reasoning Using Prompt-generated Rationales Peifeng Wang, Aaron Chan, Filip Ilievski, Muhao Chen, Xiang Ren
- Can LLMs Express Their Uncertainty? An Empirical Evaluation Of Confidence Elicitation In LLMs Miao Xiong et al.
- A Systematic Study And Comprehensive Evaluation Of ChatGPT On Benchmark Datasets Md Tahmid Rahman Laskar et al.
- GPTAraEval: A Comprehensive Evaluation Of ChatGPT On Arabic NLP Md Tawkat Islam Khondaker, Abdul Waheed, El Moatez Billah Nagoudi, Muhammad Abdul-Mageed
- DriveGPT4: Interpretable End-to-end Autonomous Driving Via Large Language Model Zhenhua Xu et al.
- An Empirical Evaluation Of Using Large Language Models For Automated Unit Test Generation Max Schäfer, Sarah Nadi, Aryaz Eghbali, Frank Tip
- Distilling Large Language Models For Matching Patients To Clinical Trials Mauro Nievas, Aditya Basu, Yanshan Wang, Hrituraj Singh
- InternVL: Scaling Up Vision Foundation Models And Aligning For Generic Visual-linguistic Tasks Zhe Chen et al.
- Large Language Models Effectively Leverage Document-level Context For Literary Translation, But Critical Errors Persist Marzena Karpinska, Mohit Iyyer
- Few-shot Fine-tuning Vs. In-context Learning: A Fair Comparison And Evaluation Marius Mosbach, Tiago Pimentel, Shauli Ravfogel, Dietrich Klakow, Yanai Elazar
- APPLeNet: Visual Attention Parameterized Prompt Learning For Few-shot Remote Sensing Image Generalization Using CLIP Mainak Singha, Ankit Jha, Bhupendra Solanki, Shirsha Bose, Biplab Banerjee
- Document-level Machine Translation With Large Language Models Longyue Wang et al.
- LLM-grounded Diffusion: Enhancing Prompt Understanding Of Text-to-image Diffusion Models With Large Language Models Long Lian, Boyi Li, Adam Yala, Trevor Darrell
- Driving With LLMs: Fusing Object-level Vector Modality For Explainable Autonomous Driving Long Chen et al.
- Taiyi: A Bilingual Fine-tuned Large Language Model For Diverse Biomedical Tasks Ling Luo et al.
- Leveraging Pre-trained Large Language Models To Construct And Utilize World Models For Model-based Task Planning Lin Guan, Karthik Valmeekam, Sarath Sreedharan, Subbarao Kambhampati
- Do LLMs Exhibit Human-like Response Biases? A Case Study In Survey Design Lindia Tjuatja, Valerie Chen, Sherry Tongshuang Wu, Ameet Talwalkar, Graham Neubig
- Reasoning On Graphs: Faithful And Interpretable Large Language Model Reasoning Linhao Luo, Yuan-Fang Li, Gholamreza Haffari, Shirui Pan
- Judging LLM-as-a-Judge With MT-Bench And Chatbot Arena Lianmin Zheng et al.
- Can ChatGPT Replace StackOverflow? A Study On Robustness And Reliability Of Large Language Model Code Generation Li Zhong, Zilong Wang
- Improving Text Embeddings With Large Language Models Liang Wang et al.
- Zephyr: Direct Distillation Of LM Alignment Lewis Tunstall et al.
- A Survey On Hallucination In Large Language Models: Principles, Taxonomy, Challenges, And Open Questions Lei Huang et al.
- ULIP-2: Towards Scalable Multimodal Pre-training For 3D Understanding Le Xue et al.
- Mvbench: A Comprehensive Multi-modal Video Understanding Benchmark Kunchang Li et al.
- Superclue: A Comprehensive Chinese Large Language Model Benchmark Liang Xu et al.
- News Verifiers Showdown: A Comparative Performance Evaluation Of Chatgpt 3.5, Chatgpt 4.0, Bing AI, And Bard In News Fact-checking Kevin Matthe Caramancion
- Inference-time Intervention: Eliciting Truthful Answers From A Language Model Kenneth Li, Oam Patel, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg
- Just Ask For Calibration: Strategies For Eliciting Calibrated Confidence Scores From Language Models Fine-tuned With Human Feedback Katherine Tian et al.
- Evaluating Language Models For Mathematics Through Interactions Katherine M. Collins et al.
- Geochat: Grounded Large Vision-language Model For Remote Sensing Kartik Kuckreja et al.
- Biomedical Knowledge Graph-optimized Prompt Generation For Large Language Models Karthik Soman et al.
- Topical-chat: Towards Knowledge-grounded Open-domain Conversations Karthik Gopalakrishnan et al.
- Mvp: Multi-view Prompting Improves Aspect Sentiment Tuple Prediction Zhibin Gou, Qingyan Guo, Yujiu Yang
- Towards Expert-level Medical Question Answering With Large Language Models Karan Singhal et al.
- Chipgpt: How Far Are We From Natural Language Hardware Design Kaiyan Chang et al.
- Evaluation And Analysis Of Hallucination In Large Vision-language Models Junyang Wang et al.
- Recommendation As Instruction Following: A Large Language Model Empowered Recommendation Approach Junjie Zhang et al.
- Is Chatgpt A Good Recommender? A Preliminary Study Junling Liu et al.
- Llama-reviewer: Advancing Code Review Automation With Large Language Models Through Parameter-efficient Fine-tuning Junyi Lu, Lei Yu, Xiaojia Li, Li Yang, Chun Zuo
- Evaluating GPT-4 And Chatgpt On Japanese Medical Licensing Examinations Jungo Kasai, Yuhei Kasai, Keisuke Sakaguchi, Yutaro Yamada, Dragomir Radev
- Chatcounselor: A Large Language Models For Mental Health Support June M. Liu et al.
- Honeybee: Locality-enhanced Projector For Multimodal LLM Junbum Cha, Wooyoung Kang, Jonghwan Mun, Byungseok Roh
- Breaking The Silence: The Threats Of Using Llms In Software Engineering June Sallou, Thomas Durieux, Annibale Panichella
- Minigpt-v2: Large Language Model As A Unified Interface For Vision-language Multi-task Learning Jun Chen et al.
- MEGA: Multilingual Evaluation Of Generative AI Kabir Ahuja et al.
- Towards Llm-based Autograding For Short Textual Answers Johannes Schneider, Bernd Schenk, Christina Niklaus
- LEXTREME: A Multi-lingual And Multi-task Benchmark For The Legal Domain Joel Niklaus et al.
- Exploring The Benefits Of Training Expert Language Models Over Instruction Tuning Joel Jang et al.
- Is Chatgpt Fair For Recommendation? Evaluating Fairness In Large Language Model Recommendation Jizhi Zhang et al.
- Gptscore: Evaluate As You Desire Jinlan Fu, See-kiong Ng, Zhengbao Jiang, Pengfei Liu
- The Potential And Pitfalls Of Using A Large Language Model Such As Chatgpt Or GPT-4 As A Clinical Assistant Jingqing Zhang et al.
- Graphix-t5: Mixing Pre-trained Transformers With Graph-aware Layers For Text-to-sql Parsing Jinyang Li et al.
- On The Robustness Of Chatgpt: An Adversarial And Out-of-distribution Perspective Jindong Wang et al.
- Prompt-and-align: Prompt-based Social Alignment For Few-shot Fake News Detection Jiaying Wu, Shen Li, Ailin Deng, Miao Xiong, Bryan Hooi
- Benchmarking Large Language Models In Retrieval-augmented Generation Jiawei Chen, Hongyu Lin, Xianpei Han, Le Sun
- Unified-io 2: Scaling Autoregressive Multimodal Models With Vision, Language, Audio, And Action Jiasen Lu et al.
- A Unified Generative Retriever For Knowledge-intensive Language Tasks Via Prompt Learning Jiangui Chen et al.
- Onellm: One Framework To Align All Modalities With Language Jiaming Han et al.
- Llm-grounder: Open-vocabulary 3D Visual Grounding With Large Language Model As An Agent Jianing Yang et al.
- ICL-D3IE: In-context Learning With Diverse Demonstrations Updating For Document Information Extraction Jiabang He et al.
- Graphgpt: Graph Instruction Tuning For Large Language Models Jiabin Tang et al.
- VILA: On Pre-training For Visual Language Models Ji Lin et al.
- GPT-3.5, GPT-4, Or BARD? Evaluating Llms Reasoning Ability In Zero-shot Setting And Performance Boosting Through Prompts Jessica López Espejel, El Hassane Ettifouri, Mahaman Sanoussi Yahaya Alassan, El Mehdi Chouham, Walid Dahhane
- AWQ: Activation-aware Weight Quantization For LLM Compression And Acceleration Ji Lin et al.
- Symbol Tuning Improves In-context Learning In Language Models Jerry Wei et al.
- Evaluating Large Language Models On A Highly-specialized Topic, Radiation Oncology Physics Jason Holmes et al.
- Chatgpt: Jack Of All Trades, Master Of None Jan Kocoń et al.
- Paperqa: Retrieval-augmented Generative Agent For Scientific Research Jakub Lála et al.
- The Robots Are Here: Navigating The Generative AI Revolution In Computing Education James Prather et al.
- Chatgpt To Replace Crowdsourcing Of Paraphrases For Intent Classification: Higher Diversity And Comparable Model Robustness Jan Cegin, Jakub Simko, Peter Brusilovsky
- Simple And Controllable Music Generation Jade Copet et al.
- Fake News In Sheep's Clothing: Robust Fake News Detection Against Llm-empowered Style Attacks Jiaying Wu, Jiafeng Guo, Bryan Hooi
- Evaluation Of Chatgpt On Biomedical Tasks: A Zero-shot Comparison With Fine-tuned Generative Transformers Israt Jahan, Md Tahmid Rahman Laskar, Chun Peng, Jimmy Huang
- "it's Not Like Jarvis, But It's Pretty Close!" -- Examining Chatgpt's Usage Among Undergraduate Students In Computer Science Ishika Joshi, Ritvik Budhiraja, Harshal D Akolekar, Jagat Sesh Challa, Dhruv Kumar
- A Comprehensive Evaluation Of Large Language Models On Benchmark Biomedical Text Processing Tasks Israt Jahan, Md Tahmid Rahman Laskar, Chun Peng, Jimmy Huang
- Chainforge: A Visual Toolkit For Prompt Engineering And LLM Hypothesis Testing Ian Arawjo, Chelse Swoopes, Priyan Vaithilingam, Martin Wattenberg, Elena Glassman
- Principle-driven Self-alignment Of Language Models From Scratch With Minimal Human Supervision Zhiqing Sun et al.
- Muse: Text-to-image Generation Via Masked Generative Transformers Huiwen Chang et al.
- A Comprehensive Overview Of Large Language Models Humza Naveed et al.
- Llama: Open And Efficient Foundation Language Models Hugo Touvron et al.
- Llama 2: Open Foundation And Fine-tuned Chat Models Hugo Touvron et al.
- Semantic Compression With Large Language Models Henry Gilbert, Michael Sandborn, Douglas C. Schmidt, Jesse Spencer-smith, Jules White
- Self-chained Image-language Model For Video Localization And Question Answering Shoubin Yu, Jaemin Cho, Prateek Yadav, Mohit Bansal
- Boosting Theory-of-mind Performance In Large Language Models Via Prompting Shima Rahimi Moghaddam, Christopher J. Honey
- Mixture-of-experts Meets Instruction Tuning: A Winning Combination For Large Language Models Sheng Shen et al.
- Scaling Vision-language Models With Sparse Mixture Of Experts Sheng Shen et al.
- The Flan Collection: Designing Data And Methods For Effective Instruction Tuning Shayne Longpre et al.
- Evaluation Of Chatgpt Family Of Models For Biomedical Reasoning And Classification Shan Chen et al.
- The Cot Collection: Improving Zero-shot And Few-shot Learning Of Language Models Via Chain-of-thought Fine-tuning Seungone Kim et al.
- Factscore: Fine-grained Atomic Evaluation Of Factual Precision In Long Form Text Generation Sewon Min et al.
- On Codex Prompt Engineering For OCL Generation: An Empirical Study Seif Abukhalaf, Mohammad Hamdaqa, Foutse Khomh
- A Comparative Study Of Open-source Large Language Models, GPT-4 And Claude 2: Multiple-choice Test Taking In Nephrology Sean Wu et al.
- Medalign: A Clinician-generated Dataset For Instruction Following With Electronic Medical Records Scott L. Fleming et al.
- How Useful Are Educational Questions Generated By Large Language Models? Sabina Elkins, Ekaterina Kochmar, Jackie C. K. Cheung, Iulian Serban
- Ai-assisted Coding: Experiments With GPT-4 Russell A Poldrack, Thomas Lu, Gašper Beguš
- Are Emergent Abilities Of Large Language Models A Mirage? Rylan Schaeffer, Brando Miranda, Sanmi Koyejo
- The Science Of Detecting Llm-generated Texts Ruixiang Tang, Yu-neng Chuang, Xia Hu
- Gpt4tools: Teaching Large Language Model To Use Tools Via Self-instruction Rui Yang et al.
- Gpteval: A Survey On Assessments Of Chatgpt And GPT-4 Rui Mao, Guanyi Chen, Xulang Zhang, Frank Guerin, Erik Cambria
- Tinystories: How Small Can Language Models Be And Still Speak Coherent English? Ronen Eldan, Yuanzhi Li
- Palm 2 Technical Report Rohan Anil et al.
- Llm-assisted Content Analysis: Using Large Language Models To Support Deductive Coding Robert Chew, John Bollenbacher, Michael Wenger, Jessica Speer, Annice Kim
- Scaling Laws For Language Encoding Models In Fmri Richard Antonello, Aditya Vaidya, Alexander G. Huth
- Chatgpt Versus Traditional Question Answering For Knowledge Graphs: Current Status And Future Directions Towards Knowledge Graph Chatbots Reham Omar, Omij Mangukiya, Panos Kalnis, Essam Mansour
- Automatic Prompt Optimization With "gradient Descent" And Beam Search Reid Pryzant et al.
- Llama-adapter: Efficient Fine-tuning Of Language Models With Zero-init Attention Renrui Zhang et al.
- Pro-cap: Leveraging A Frozen Vision-language Model For Hateful Meme Detection Rui Cao et al.
- Prompt-based Distribution Alignment For Unsupervised Domain Adaptation Shuanghao Bai et al.
- Chatgpt As A Factual Inconsistency Evaluator For Text Summarization Zheheng Luo, Qianqian Xie, Sophia Ananiadou
- Generating With Confidence: Uncertainty Quantification For Black-box Large Language Models Zhen Lin, Shubhendu Trivedi, Jimeng Sun
- Sabiá: Portuguese Large Language Models Ramon Pires, Hugo Abonizio, Thales Sales Almeida, Rodrigo Nogueira
- Can We Trust The Evaluation On Chatgpt? Rachith Aiyappa, Jisun An, Haewoon Kwak, Yong-yeol Ahn
- Evaluation Of Chatgpt-generated Medical Responses: A Systematic Review And Meta-analysis Qiuhong Wei et al.
- Codegeex: A Pre-trained Model For Code Generation With Multilingual Benchmarking On Humaneval-x Qinkai Zheng et al.
- Mplug-owl: Modularization Empowers Large Language Models With Multimodality Qinghao Ye et al.
- Translating Radiology Reports Into Plain Language Using Chatgpt And GPT-4 With Prompt Learning: Promising Results, Limitations, And Potential Qing Lyu et al.
- Faithful Chain-of-thought Reasoning Qing Lyu et al.
- Grounded Text-to-image Synthesis With Attention Refocusing Quynh Phung, Songwei Ge, Jia-bin Huang
- Genegpt: Augmenting Large Language Models With Domain Tools For Improved Access To Biomedical Information Qiao Jin, Yifan Yang, Qingyu Chen, Zhiyong Lu
- Medcpt: Contrastive Pre-trained Transformers With Large-scale Pubmed Search Logs For Zero-shot Biomedical Information Retrieval Qiao Jin et al.
- AI Transparency In The Age Of Llms: A Human-centered Research Roadmap Q. Vera Liao, Jennifer Wortman Vaughan
- Large Language Models Sensitivity To The Order Of Options In Multiple-choice Questions Pouya Pezeshkpour, Estevam Hruschka
- Visually-prompted Language Model For Fine-grained Scene Graph Generation In An Open World Qifan Yu et al.
- Starcoder: May The Source Be With You! Raymond Li et al.
- Going Beyond Nouns With Vision & Language Models Using Synthetic Data Paola Cascante-bonilla et al.
- Internlm-xcomposer: A Vision-language Large Model For Advanced Text-image Comprehension And Composition Pan Zhang et al.
- GPT-4 Technical Report OpenAI et al.
- Fusecap: Leveraging Large Language Models For Enriched Fused Image Captions Noam Rotstein, David Bensaid, Shaked Brody, Roy Ganz, Ron Kimmel
- Large Language Models Are Built-in Autoregressive Search Engines Noah Ziems, Wenhao Yu, Zhihan Zhang, Meng Jiang
- Reflexion: Language Agents With Verbal Reinforcement Learning Noah Shinn et al.
- Enhancing Chat Language Models By Scaling High-quality Instructional Conversations Ning Ding et al.
- Bridging The Gap: A Survey On Integrating (human) Feedback For Natural Language Generation Patrick Fernandes et al.
- Sources Of Hallucination By Large Language Models On Inference Tasks Nick Mckenna et al.
- Self-contradictory Hallucinations Of Large Language Models: Evaluation, Detection And Mitigation Niels Mündler, Jingxuan He, Slobodan Jenko, Martin Vechev
- Jais And Jais-chat: Arabic-centric Foundation And Instruction-tuned Open Generative Large Language Models Neha Sengupta et al.
- Lost In The Middle: How Language Models Use Long Contexts Nelson F. Liu et al.
- Chatgpt MT: Competitive For High- (but Not Low-) Resource Languages Nathaniel R. Robinson, Perez Ogayo, David R. Mortensen, Graham Neubig
- Self-regulating Prompts: Foundational Model Adaptation Without Forgetting Muhammad Uzair Khattak et al.
- State Of What Art? A Call For Multi-prompt LLM Evaluation Moran Mizrahi et al.
- Using Large Language Models To Generate Junit Tests: An Empirical Study Mohammed Latif Siddiq et al.
- DIN-SQL: Decomposed In-context Learning Of Text-to-sql With Self-correction Mohammadreza Pourreza, Davood Rafiei
- Do Llms Understand Social Knowledge? Evaluating The Sociability Of Large Language Models With Socket Benchmark Minje Choi, Jiaxin Pei, Sagar Kumar, Chang Shu, David Jurgens
- Api-bank: A Comprehensive Benchmark For Tool-augmented Llms Minghao Li et al.
- A Simple And Effective Pruning Approach For Large Language Models Mingjie Sun, Zhuang Liu, Anna Bair, J. Zico Kolter
- Time-llm: Time Series Forecasting By Reprogramming Large Language Models Ming Jin et al.
- Med-flamingo: A Multimodal Medical Few-shot Learner Michael Moor et al.
- Video-chatgpt: Towards Detailed Video Understanding Via Large Vision And Language Models Muhammad Maaz, Hanoona Rasheed, Salman Khan, Fahad Shahbaz Khan
- Codekgc: Code Language Model For Generative Knowledge Graph Construction Zhen Bi et al.
- LAMM: Language-assisted Multi-modal Instruction-tuning Dataset, Framework, And Benchmark Zhenfei Yin et al.
- Large Language Models Are Effective Text Rankers With Pairwise Ranking Prompting Zhen Qin et al.
- Psy-llm: Scaling Up Global Mental Health Psychological Services With Ai-based Large Language Models Tin Lai et al.
- Empirical Study Of Zero-shot NER With Chatgpt Tingyu Xie et al.
- Large Language Models Are State-of-the-art Evaluators Of Translation Quality Tom Kocmi, Christian Federmann
- Qlora: Efficient Finetuning Of Quantized Llms Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, Luke Zettlemoyer
- RLHF-V: Towards Trustworthy Mllms Via Behavior Alignment From Fine-grained Correctional Human Feedback Tianyu Yu et al.
- Enabling Large Language Models To Generate Text With Citations Tianyu Gao, Howard Yen, Jiatong Yu, Danqi Chen
- Large Language Model Alignment: A Survey Tianhao Shen et al.
- Red Teaming Chatgpt Via Jailbreaking: Bias, Robustness, Reliability And Toxicity Terry Yue Zhuo, Yujin Huang, Chunyang Chen, Zhenchang Xing
- Hallusionbench: An Advanced Diagnostic Suite For Entangled Language Hallucination And Visual Illusion In Large Vision-language Models Tianrui Guan et al.
- Having Beer After Prayer? Measuring Cultural Bias In Large Language Models Tarek Naous, Michael J. Ryan, Alan Ritter, Wei Xu
- Is Chatgpt A Highly Fluent Grammatical Error Correction System? A Comprehensive Evaluation Tao Fang et al.
- Open-ended Medical Visual Question Answering Through Prefix Tuning Of Language Models Tom Van Sonsbeek, Mohammad Mahdi Derakhshani, Ivona Najdenkoska, Cees G. M. Snoek, Marcel Worring
- What Can Large Language Models Do In Chemistry? A Comprehensive Benchmark On Eight Tasks Taicheng Guo et al.
- Chatgpt: Beginning Of An End Of Manual Linguistic Data Annotation? Use Case Of Automatic Genre Identification Taja Kuzman, Igor Mozetič, Nikola Ljubešić
- Evallm: Interactive Evaluation Of Large Language Model Prompts On User-defined Criteria Tae Soo Kim, Yoonjoo Lee, Jamin Shin, Young-ho Kim, Juho Kim
- Large Language Models As General Pattern Machines Suvir Mirchandani et al.
- Observations On Llms For Telecom Domain: Capabilities And Limitations Sumit Soman, Ranjani H G
- Orca: Progressive Learning From Complex Explanation Traces Of GPT-4 Subhabrata Mukherjee et al.
- Transformative Effects Of Chatgpt On Modern Education: Emerging Era Of AI Chatbots Sukhpal Singh Gill et al.
- Analyzing The Performance Of GPT-3.5 And GPT-4 In Grammatical Error Correction Steven Coyne, Keisuke Sakaguchi, Diana Galvan-sosa, Michael Zock, Kentaro Inui
- Pretraining Language Models With Human Preferences Tomasz Korbak et al.
- AI, Write An Essay For Me: A Large-scale Comparison Of Human-written Versus Chatgpt-generated Essays Steffen Herbold, Annette Hautli-janisz, Ute Heuer, Zlata Kikteva, Alexander Trautsch
- Expressive Text-to-image Generation With Rich Text Songwei Ge, Taesung Park, Jun-yan Zhu, Jia-bin Huang
- Zhongjing: Enhancing The Chinese Medical Capabilities Of Large Language Model Through Expert Feedback And Real-world Multi-turn Dialogue Songhua Yang et al.
- Revisiting Relation Extraction In The Era Of Large Language Models Somin Wadhwa, Silvio Amir, Byron C. Wallace
- Llm-empowered Chatbots For Psychiatrist And Patient Simulation: Application And Evaluation Siyuan Chen et al.
- Metagpt: Meta Programming For A Multi-agent Collaborative Framework Sirui Hong et al.
- On The Possibilities Of Ai-generated Text Detection Souradip Chakraborty et al.
- Thoughtsource: A Central Hub For Large Language Model Reasoning Data Simon Ott et al.
- Wikichat: Stopping The Hallucination Of Large Language Model Chatbots By Few-shot Grounding On Wikipedia Sina J. Semnani, Violet Z. Yao, Heidi C. Zhang, Monica S. Lam
- Mind Meets Machine: Unravelling GPT-4's Cognitive Psychology Sifatkaur Dhingra, Manmeet Singh, Vaisakh Sb, Neetiraj Malviya, Sukhpal Singh Gill
- Mitigating Object Hallucinations In Large Vision-language Models Through Visual Contrastive Decoding Sicong Leng et al.
- A Survey On Multimodal Large Language Models Shukang Yin et al.
- From Words To Watts: Benchmarking The Energy Costs Of Large Language Model Inference Siddharth Samsi et al.
- Inpars-v2: Large Language Models As Efficient Dataset Generators For Information Retrieval Vitor Jeronymo et al.
- Chatgpt Beyond English: Towards A Comprehensive Evaluation Of Large Language Models In Multilingual Learning Viet Dac Lai et al.
- Is GPT-4 A Reliable Rater? Evaluating Consistency In GPT-4 Text Ratings Veronika Hackl, Alexandra Elena Müller, Michael Granitzer, Maximilian Sailer
- Fully Autonomous Programming With Large Language Models Vadim Liventsev, Anastasiia Grishina, Aki Härmä, Leon Moonen
- Evaluating Correctness And Faithfulness Of Instruction-following Models For Question Answering Vaibhav Adlakha, Parishad Behnamghader, Xing Han Lu, Nicholas Meade, Siva Reddy
- Automating Human Tutor-style Programming Feedback: Leveraging GPT-4 Tutor Model For Hint Generation And GPT-3.5 Student Model For Hint Validation Tung Phung et al.
- Generative AI For Programming Education: Benchmarking Chatgpt, GPT-4, And Human Tutors Tung Phung et al.
- Freshllms: Refreshing Large Language Models With Search Engine Augmentation Tu Vu et al.
- Art Or Artifice? Large Language Models And The False Promise Of Creativity Tuhin Chakrabarty, Philippe Laban, Divyansh Agarwal, Smaranda Muresan, Chien-sheng Wu
- Large Language Models Fail On Trivial Alterations To Theory-of-mind Tasks Tomer Ullman
- Trusting Your Evidence: Hallucinate Less With Context-aware Decoding Weijia Shi et al.
- Mm-vet: Evaluating Large Multimodal Models For Integrated Capabilities Weihao Yu et al.
- Promptcblue: A Chinese Prompt Tuning Benchmark For The Medical Domain Wei Zhu, Xiaoling Wang, Huanran Zheng, Mosha Chen, Buzhou Tang
- Copiloting The Copilots: Fusing Large Language Models With Completion Engines For Automated Program Repair Yuxiang Wei, Chunqiu Steven Xia, Lingming Zhang
- A Preliminary Evaluation Of Chatgpt For Zero-shot Dialogue Understanding Wenbo Pan, Qiguang Chen, Xiao Xu, Wanxiang Che, Libo Qin
- Llmrec: Large Language Models With Graph Augmentation For Recommendation Wei Wei et al.
- A Survey Of Large Language Models Wayne Xin Zhao et al.
- Is Chatgpt Good At Search? Investigating Large Language Models As Re-ranking Agents Weiwei Sun et al.
- BLIVA: A Simple Multimodal LLM For Better Handling Of Text-rich Visual Questions Wenbo Hu et al.
- Roco: Dialectic Multi-robot Collaboration With Large Language Models Zhao Mandi, Shreeya Jain, Shuran Song
- Large Language Models As Zero-shot Conversational Recommenders Zhankui He et al.
- Chatgpt For PLC/DCS Control Logic Generation Heiko Koziolek, Sten Gruener, Virendra Ashiwal
- Capabilities Of GPT-4 On Medical Challenge Problems Harsha Nori, Nicholas King, Scott Mayer Mckinney, Dean Carignan, Eric Horvitz
- Can Generalist Foundation Models Outcompete Special-purpose Tuning? Case Study In Medicine Harsha Nori et al.
- Not All Languages Are Created Equal In Llms: Improving Multilingual Capability By Cross-lingual-thought Prompting Haoyang Huang et al.
- Ferret: Refer And Ground Anything Anywhere At Any Granularity Haoxuan You et al.
- Improved Baselines With Visual Instruction Tuning Haotian Liu, Chunyuan Li, Yuheng Li, Yong Jae Lee
- Is Chatgpt The Ultimate Programming Assistant -- How Far Is It? Haoye Tian et al.
- Extractive Summarization Via Chatgpt For Faithful Summary Generation Haopeng Zhang, Xiao Liu, Jiawei Zhang
- Chatgpt Or Grammarly? Evaluating Chatgpt On Grammatical Error Correction Benchmark Haoran Wu, Wenxuan Wang, Yuxuan Wan, Wenxiang Jiao, Michael Lyu
- CMMLU: Measuring Massive Multitask Language Understanding In Chinese Haonan Li et al.
- Can Chatgpt Assess Human Personalities? A General Evaluation Framework Haocong Rao, Cyril Leung, Chunyan Miao
- Autodroid: Llm-powered Task Automation In Android Hao Wen et al.
- Lmdrive: Closed-loop End-to-end Driving With Large Language Models Hao Shao et al.
- Safety Assessment Of Chinese Large Language Models Hao Sun, Zhexin Zhang, Jiawen Deng, Jiale Cheng, Minlie Huang
- Chain Of Hindsight Aligns Language Models With Feedback Hao Liu, Carmelo Sferrazza, Pieter Abbeel
- CRITIC: Large Language Models Can Self-correct With Tool-interactive Critiquing Zhibin Gou et al.
- Visual-language Prompt Tuning With Knowledge-guided Context Optimization Hantao Yao, Rui Zhang, Changsheng Xu
- Glamm: Pixel Grounding Large Multimodal Model Hanoona Rasheed et al.
- Personallm: Investigating The Ability Of Large Language Models To Express Personality Traits Hang Jiang et al.
- Llama Guard: Llm-based Input-output Safeguard For Human-ai Conversations Hakan Inan et al.
- Prompting Large Language Models For Topic Modeling Han Wang et al.
- GPT-4 Enhanced Multimodal Grounding For Autonomous Driving: Leveraging Cross-modal Attention With Large Language Models Haicheng Liao et al.
- Wizardmath: Empowering Mathematical Reasoning For Large Language Models Via Reinforced Evol-instruct Haipeng Luo et al.
- Revisiting Large Language Models As Zero-shot Relation Extractors Guozheng Li, Peng Wang, Wenjun Ke
- Gender Bias And Stereotypes In Large Language Models Hadas Kotek, Rikker Dockum, David Q. Sun
- Perspectives On Large Language Models For Relevance Judgment Guglielmo Faggioli et al.
- Augmented Language Models: A Survey Grégoire Mialon et al.
- Language Models Can Solve Computer Tasks Geunwoo Kim, Pierre Baldi, Stephen Mcaleer
- Performance Of The Pre-trained Large Language Model GPT-4 On Automated Short Answer Grading Gerd Kortemeyer
- Explainability For Large Language Models: A Survey Haiyan Zhao et al.
- Lawbench: Benchmarking Legal Knowledge Of Large Language Models Zhiwei Fei et al.
- Gemini: A Family Of Highly Capable Multimodal Models Gemini Team et al.
- Multimodal Chatgpt For Medical Applications: An Experimental Study Of GPT-4V Zhiling Yan et al.
- Large Language Models Can Be Easily Distracted By Irrelevant Context Freda Shi et al.
- Repocoder: Repository-level Code Completion Through Iterative Retrieval And Generation Fengji Zhang et al.
- Exploring Human-like Translation Strategy With Large Language Models Zhiwei He et al.
- Preference Ranking Optimization For Human Alignment Feifan Song et al.
- Empower Large Language Model To Perform Better On Industrial Domain-specific Question Answering Fangkai Yang et al.
- Learning To Reason Over Scene Graphs: A Case Study Of Finetuning GPT-2 Into A Robot Language Model For Grounded Task Planning Georgia Chalvatzaki et al.
- Moviechat: From Dense Token To Sparse Memory For Long Video Understanding Enxin Song et al.
- Lasuie: Unifying Information Extraction With Latent Adaptive Structure-aware Generative Language Model Hao Fei et al.
- Aligning Large Multimodal Models With Factually Augmented RLHF Zhiqing Sun et al.
- Simulating H.P. Lovecraft Horror Literature With The Chatgpt Large Language Model Eduardo C. Garrido-merchán, José Luis Arroyo-barrigüete, Roberto Gozalo-brizuela
- Gptutor: A Chatgpt-powered Programming Tool For Code Explanation Eason Chen, Ray Huang, Han-shin Chen, Yuen-hsien Tseng, Liang-yi Li
- The Falcon Series Of Open Language Models Ebtesam Almazrouei et al.
- GPT-4 Can Pass The Korean National Licensing Examination For Korean Medicine Doctors Dongyeop Jang, Tae-rim Yun, Choong-yeol Lee, Young-kyu Kwon, Chang-eop Kim
- Llm-blender: Ensembling Large Language Models With Pairwise Ranking And Generative Fusion Dongfu Jiang, Xiang Ren, Bill Yuchen Lin
- MELTR: Meta Loss Transformer For Learning To Fine-tune Video Foundation Models Dohwan Ko et al.
- Evaluating Open-domain Question Answering In The Era Of Large Language Models Ehsan Kamalloo, Nouha Dziri, Charles L. A. Clarke, Davood Rafiei
- Chatgpt Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions Deyao Zhu et al.
- Text-to-sql Empowered By Large Language Models: A Benchmark Evaluation Dawei Gao et al.
- Palm-e: An Embodied Multimodal Language Model Danny Driess et al.
- Adapted Large Language Models Can Outperform Medical Experts In Clinical Text Summarization Dave Van Veen et al.
- Have Llms Advanced Enough? A Challenging Problem Solving Benchmark For Large Language Models Daman Arora, Himanshu Gaurav Singh, Mausam
- Improving Accuracy Of GPT-3/4 Results On Biomedical Data Using A Retrieval-augmented Language Model David Soong et al.
- Llava-med: Training A Large Language-and-vision Assistant For Biomedicine In One Day Chunyuan Li et al.
- Chatgpt Evaluation On Sentence Level Relations: A Focus On Temporal, Causal, And Discourse Relations Chunkit Chan et al.
- Progressive-hint Prompting Improves Reasoning In Large Language Models Chuanyang Zheng, Zhengying Liu, Enze Xie, Zhenguo Li, Yu Li
- Drivelm: Driving With Graph Visual Question Answering Chonghao Sima et al.
- Whitefox: White-box Compiler Fuzzing Empowered By Large Language Models Chenyuan Yang et al.
- Chateval: Towards Better Llm-based Evaluators Through Multi-agent Debate Chi-min Chan et al.
- Distilled GPT For Source Code Summarization Chia-yi Su, Collin Mcmillan
- Llm-powered Data Augmentation For Enhanced Cross-lingual Performance Chenxi Whitehouse, Monojit Choudhury, Alham Fikri Aji
- Distilling Step-by-step! Outperforming Larger Language Models With Less Training Data And Smaller Model Sizes Cheng-yu Hsieh et al.
- Can Large Language Models Be An Alternative To Human Evaluations? Cheng-han Chiang, Hung-yi Lee
- An Iterative Optimizing Framework For Radiology Report Summarization With Chatgpt Chong Ma et al.
- Dipping Plms Sauce: Bridging Structure And Text For Effective Knowledge Graph Completion Via Conditional Soft Prompting Chen Chen, Yufei Wang, Aixin Sun, Bing Li, Kwok-yan Lam
- MME: A Comprehensive Evaluation Benchmark For Multimodal Large Language Models Chaoyou Fu et al.
- Hallucination Augmented Contrastive Learning For Multimodal Large Language Model Chaoya Jiang et al.
- K2: A Foundation Language Model For Geoscience Knowledge Understanding And Utilization Cheng Deng et al.
- Llmseceval: A Dataset Of Natural Language Prompts For Security Evaluations Catherine Tony, Markus Mutas, Nicolás E. Díaz Ferreyra, Riccardo Scandariato
- A Confederacy Of Models: A Comprehensive Evaluation Of Llms On Creative Writing Carlos Gómez-rodríguez, Paul Williams
- Wizardlm: Empowering Large Language Models To Follow Complex Instructions Can Xu et al.
- Compositional Chain-of-thought Prompting For Large Multimodal Models Chancharik Mitra, Brandon Huang, Trevor Darrell, Roei Herzig
- Pmc-llama: Towards Building Open-source Language Models For Medicine Chaoyi Wu et al.
- Reinforced Self-training (rest) For Language Modeling Caglar Gulcehre et al.
- Large Language Models On Graphs: A Comprehensive Survey Bowen Jin et al.
- Prompting Or Fine-tuning? A Comparative Study Of Large Language Models For Taxonomy Construction Boqi Chen, Fandi Yi, Dániel Varró
- LLM+P: Empowering Large Language Models With Optimal Planning Proficiency Bo Liu et al.
- MIMIC-IT: Multi-modal In-context Instruction Tuning Bo Li et al.
- Seed-bench-2: Benchmarking Multimodal Large Language Models Bohao Li et al.
- Video-llava: Learning United Visual Representation By Alignment Before Projection Bin Lin et al.
- Vtimellm: Empower LLM To Grasp Video Moments Bin Huang, Xin Wang, Hong Chen, Zihan Song, Wenwu Zhu
- Evaluation Of Chatgpt For Nlp-based Mental Health Applications Bishal Lamichhane
- Swiftsage: A Generative Agent With Fast And Slow Thinking For Complex Interactive Tasks Bill Yuchen Lin et al.
- ART: Automatic Multi-step Reasoning And Tool-use For Large Language Models Bhargavi Paranjape et al.
- Can Large Language Models Transform Computational Social Science? Caleb Ziems et al.
- Code Llama: Open Foundation Models For Code Baptiste Rozière et al.
- Instruction Tuning With GPT-4 Baolin Peng, Chunyuan Li, Pengcheng He, Michel Galley, Jianfeng Gao
- Expertprompting: Instructing Large Language Models To Be Distinguished Experts Benfeng Xu et al.
- How Close Is Chatgpt To Human Experts? Comparison Corpus, Evaluation, And Detection Biyang Guo et al.
- Coupling Large Language Models With Logic Programming For Robust And General Reasoning From Text Zhun Yang, Adam Ishay, Joohyung Lee
- Clinical Camel: An Open Expert-level Medical Language Model With Dialogue-based Knowledge Encoding Augustin Toma et al.
- Refactoring Programs Using Large Language Models With Few-shot Examples Atsushi Shirafuji, Yusuke Oda, Jun Suzuki, Makoto Morishita, Yutaka Watanobe
- The False Promise Of Imitating Proprietary Llms Arnav Gudibande et al.
- Orca 2: Teaching Small Language Models How To Reason Arindam Mitra et al.
- Better Zero-shot Reasoning With Role-play Prompting Aobo Kong et al.
- RT-2: Vision-language-action Models Transfer Web Knowledge To Robotic Control Anthony Brohan et al.
- Med-halt: Medical Domain Hallucination Test For Large Language Models Ankit Pal, Logesh Kumar Umapathi, Malaikannan Sankarasubbu
- Interpretable Long-form Legal Question Answering With Retrieval-augmented Large Language Models Antoine Louis, Gijs Van Dijck, Gerasimos Spanakis
- Detecting And Preventing Hallucinations In Large Vision Language Models Anisha Gunjal, Jihan Yin, Erhan Bas
- Toxicchat: Unveiling Hidden Challenges Of Toxicity Detection In Real-world User-ai Conversation Zi Lin et al.
- On The Application Of Large Language Models For Language Teaching And Assessment Technology Andrew Caines et al.
- Chemcrow: Augmenting Large-language Models With Chemistry Tools Andres M Bran et al.
- Openassistant Conversations -- Democratizing Large Language Model Alignment Andreas Köpf et al.
- Openflamingo: An Open-source Framework For Training Large Autoregressive Vision-language Models Anas Awadalla et al.
- How Good Are GPT Models At Machine Translation? A Comprehensive Evaluation Amr Hendy et al.
- The Impact Of Positional Encoding On Length Generalization In Transformers Amirhossein Kazemnejad, Inkit Padhi, Karthikeyan Natesan Ramamurthy, Payel Das, Siva Reddy
- On Generative Agents In Recommendation An Zhang, Yuxin Chen, Leheng Sheng, Xiang Wang, Tat-seng Chua
- Self-refine: Iterative Refinement With Self-feedback Aman Madaan et al.
- Lamp: When Large Language Models Meet Personalization Alireza Salemi, Sheshera Mysore, Michael Bendersky, Hamed Zamani
- Model Tuning Or Prompt Tuning? A Study Of Large Language Models For Clinical Concept And Relation Extraction Cheng Peng et al.
- Jailbroken: How Does LLM Safety Training Fail? Alexander Wei, Nika Haghtalab, Jacob Steinhardt
- Mamba: Linear-time Sequence Modeling With Selective State Spaces Albert Gu, Tri Dao
- Baichuan 2: Open Large-scale Language Models Aiyuan Yang et al.
- Mistral 7B Albert Q. Jiang et al.
- Toolllm: Facilitating Large Language Models To Master 16000+ Real-world Apis Yujia Qin et al.
- RTLLM: An Open-source Benchmark For Design RTL Generation With Large Language Model Yao Lu, Shang Liu, Qijun Zhang, Zhiyao Xie
- Embodiedgpt: Vision-language Pre-training Via Embodied Chain Of Thought Yao Mu et al.
- Beyond Chain-of-thought, Effective Graph-of-thought Reasoning In Language Models Yao Yao, Zuchao Li, Hai Zhao
- Chatpose: Chatting About 3D Human Pose Yao Feng et al.
- Powerinfer: Fast Large Language Model Serving With A Consumer-grade GPU Yixin Song, Zeyu Mi, Haotong Xie, Haibo Chen
- Better To Ask In English: Cross-lingual Evaluation Of Large Language Models For Healthcare Queries Yiqiao Jin et al.
- On Learning To Summarize With Large Language Models As References Yixin Liu et al.
- Flexgen: High-throughput Generative Inference Of Large Language Models With A Single GPU Ying Sheng et al.
- Element-aware Summarization With Large Language Models: Expert-aligned Evaluation And Chain-of-thought Method Yiming Wang, Zhuosheng Zhang, Rui Wang
- Can Chatgpt Replace Traditional KBQA Models? An In-depth Analysis Of The Question Answering Performance Of The GPT LLM Family Yiming Tan et al.
- Evaluating Object Hallucination In Large Vision-language Models Yifan Li et al.
- "kelly Is A Warm Person, Joseph Is A Role Model": Gender Biases In Llm-generated Reference Letters Yixin Wan et al.
- Llm-eval: Unified Multi-dimensional Automatic Evaluation For Open-domain Conversations With Large Language Models Yen-ting Lin, Yun-nung Chen
- INSTRUCTEVAL: Towards Holistic Evaluation Of Instruction-tuned Large Language Models Yew Ken Chia, Pengfei Hong, Lidong Bing, Soujanya Poria
- A Multitask, Multilingual, Multimodal Evaluation Of Chatgpt On Reasoning, Hallucination, And Interactivity Yejin Bang et al.
- Mindmap: Knowledge Graph Prompting Sparks Graph Of Thoughts In Large Language Models Yilin Wen, Zifeng Wang, Jimeng Sun
- How Far Can Camels Go? Exploring The State Of Instruction Tuning On Open Resources Yizhong Wang et al.
- Analyzing And Mitigating Object Hallucination In Large Vision-language Models Yiyang Zhou et al.
- Llavar: Enhanced Visual Instruction Tuning For Text-rich Image Understanding Yanzhe Zhang et al.
- March In Chat: Interactive Prompting For Remote Embodied Referring Expression Yanyuan Qiao, Yuankai Qi, Zheng Yu, Jing Liu, Qi Wu
- Alpacafarm: A Simulation Framework For Methods That Learn From Human Feedback Yann Dubois et al.
- Llama-vid: An Image Is Worth 2 Tokens In Large Language Models Yanwei Li, Chengyao Wang, Jiaya Jia
- G-eval: NLG Evaluation Using GPT-4 With Better Human Alignment Yang Liu et al.
- Specinfer: Accelerating Generative Large Language Model Serving With Tree-based Speculative Inference And Verification Xupeng Miao et al.
- Mental-llm: Leveraging Large Language Models For Mental Health Prediction Via Online Text Data Xuhai Xu et al.
- A Survey On Model Compression For Large Language Models Xunyu Zhu, Jian Li, Yong Liu, Can Ma, Weiping Wang
- Emotional Intelligence Of Large Language Models Xuena Wang, Xueting Li, Zi Yin, Yue Wu, Liu Jia
- Fine-tuning Llama For Multi-stage Text Retrieval Xueguang Ma, Liang Wang, Nan Yang, Furu Wei, Jimmy Lin
- Classeval: A Manually-crafted Benchmark For Evaluating Llms On Class-level Code Generation Xueying Du et al.
- Integrating Action Knowledge And Llms For Task Planning And Situation Handling In Open Worlds Yan Ding et al.
- Improving Language Model Negotiation With Self-play And In-context Learning From AI Feedback Yao Fu, Hao Peng, Tushar Khot, Mirella Lapata
- Teaching Large Language Models To Self-debug Xinyun Chen, Maxwell Lin, Nathanael Schärli, Denny Zhou
- Mitigating Large Language Model Hallucinations Via Autonomous Knowledge Graph-based Retrofitting Xinyan Guan et al.
- Vision Language Models In Autonomous Driving: A Survey And Outlook Xingcheng Zhou et al.
- Query Rewriting For Retrieval-augmented Large Language Models Xinbei Ma, Yeyun Gong, Pengcheng He, Hai Zhao, Nan Duan
- Rethinking The Evaluation For Conversational Recommendation In The Era Of Large Language Models Xiaolei Wang, Xinyu Tang, Wayne Xin Zhao, Jingyuan Wang, Ji-rong Wen
- PMC-VQA: Visual Instruction Tuning For Medical Visual Question Answering Xiaoman Zhang et al.
- LISA: Reasoning Segmentation Via Large Language Model Xin Lai et al.
- Summarization Is (almost) Dead Xiao Pu, Mingqi Gao, Xiaojun Wan
- HPC-GPT: Integrating Large Language Model For High-performance Computing Xianzhong Ding et al.
- MMMU: A Massive Multi-discipline Multimodal Understanding And Reasoning Benchmark For Expert AGI Xiang Yue et al.
- Representation Learning With Large Language Models For Recommendation Xubin Ren et al.
- Pali-3 Vision Language Models: Smaller, Faster, Stronger Xi Chen et al.
- Cogagent: A Visual Language Model For GUI Agents Wenyi Hong et al.
- M3exam: A Multilingual, Multimodal, Multilevel Benchmark For Examining Large Language Models Wenxuan Zhang, Sharifah Mahani Aljunied, Chang Gao, Yew Ken Chia, Lidong Bing
- Is Chatgpt A Good Translator? Yes With GPT-4 As The Engine Wenxiang Jiao et al.
- Universalner: Targeted Distillation From Large Language Models For Open Named Entity Recognition Wenxuan Zhou, Sheng Zhang, Yu Gu, Muhao Chen, Hoifung Poon
- SPHINX: The Joint Mixing Of Weights, Tasks, And Visual Embeddings For Multi-modal Large Language Models Ziyi Lin et al.
- Sentence Simplification Via Large Language Models Yutao Feng, Jipeng Qiang, Yun Li, Yunhao Yuan, Yi Zhu
- Longbench: A Bilingual, Multitask Benchmark For Long Context Understanding Yushi Bai et al.
- Editing Large Language Models: Problems, Methods, And Opportunities Yunzhi Yao et al.
- Lampilot: An Open Benchmark Dataset For Autonomous Driving With Language Model Programs Yunsheng Ma et al.
- Tool Learning With Foundation Models Yujia Qin et al.
- Exploring The Impact Of Instruction Data Scaling On Large Language Models: An Empirical Study On Real-world Use Cases Yunjie Ji et al.
- Contextual Object Detection With Multimodal Large Language Models Yuhang Zang, Wei Li, Jun Han, Kaiyang Zhou, Chen Change Loy
- Aligning Large Language Models With Human: A Survey Yufei Wang et al.
- Toolqa: A Dataset For LLM Question Answering With External Tools Yuchen Zhuang, Yue Yu, Kuan Wang, Haotian Sun, Chao Zhang
- Preventing Zero-shot Transfer Degradation In Continual Learning Of Vision-language Models Zangwei Zheng et al.
- MEDITRON-70B: Scaling Medical Pretraining For Large Language Models Zeming Chen et al.
- Is Chatgpt A Good Sentiment Analyzer? A Preliminary Study Zengzhi Wang et al.
- Fine-grained Human Feedback Gives Better Rewards For Language Model Training Zeqiu Wu et al.
- Large Language Models In Healthcare And Medical Domain: A Review Zabir Al Nazi, Wei Peng
- C-eval: A Multi-level Multi-discipline Chinese Evaluation Suite For Foundation Models Yuzhen Huang et al.
- Learning Gain Differences Between Chatgpt And Human Tutor Generated Algebra Hints Zachary A. Pardos, Shreya Bhandari
- Let The Llms Talk: Simulating Human-to-human Conversational QA Via Zero-shot Llm-to-llm Interactions Zahra Abbasiantaeb, Yifei Yuan, Evangelos Kanoulas, Mohammad Aliannejadi
- Llava-mr: Large Language-and-vision Assistant For Video Moment Retrieval Weiheng Lu et al.
- Chatbot Arena: An Open Platform For Evaluating Llms By Human Preference Wei-lin Chiang et al.
- Billm: Pushing The Limit Of Post-training Quantization For Llms Wei Huang et al.
- Continual Learning For Large Language Models: A Survey Tongtong Wu et al.
- Chatglm: A Family Of Large Language Models From GLM-130B To GLM-4 All Tools Team Glm et al.
- Towards Conversational Diagnostic AI Tao Tu et al.
- Adaptmllm: Fine-tuning Multilingual Language Models On Low-resource Languages With Integrated LLM Playgrounds Séamus Lankford, Haithem Afli, Andy Way
- Who Validates The Validators? Aligning Llm-assisted Evaluation Of LLM Outputs With Human Preferences Shreya Shankar, J. D. Zamfirescu-pereira, Björn Hartmann, Aditya G. Parameswaran, Ian Arawjo
- Eyes Wide Shut? Exploring The Visual Shortcomings Of Multimodal Llms Shengbang Tong et al.
- Hidden Flaws Behind Expert-level Accuracy Of Multimodal GPT-4 Vision In Medicine Qiao Jin et al.
- Me Llama: Foundation Large Language Models For Medical Applications Qianqian Xie et al.
- SNIFFER: Multimodal Large Language Model For Explainable Out-of-context Misinformation Detection Peng Qi, Zehong Yan, Wynne Hsu, Mong Li Lee
- Ai-augmented Brainwriting: Investigating The Use Of Llms In Group Ideation Orit Shaer, Angelora Cooper, Osnat Mokryn, Andrew L. Kun, Hagit Ben Shoshan
- Jamba: A Hybrid Transformer-mamba Language Model Opher Lieber et al.
- Iris: An Ai-driven Virtual Tutor For Computer Science Education Patrick Bassner, Eduard Frankford, Stephan Krusche
- CBR-RAG: Case-based Reasoning For Retrieval Augmented Generation In Llms For Legal Question Answering Nirmalie Wiratunga et al.
- Findings Of The Second Babylm Challenge: Sample-efficient Pretraining On Developmentally Plausible Corpora Michael Y. Hu et al.
- A Review Of Large Language Models And Autonomous Agents In Chemistry Mayk Caldas Ramos, Christopher J. Collison, Andrew D. White
- The Effect Of Sampling Temperature On Problem Solving In Large Language Models Matthew Renze, Erhan Guven
- Language Models For Code Completion: A Practical Evaluation Maliheh Izadi et al.
- Linrec: Linear Attention Mechanism For Long-term Sequential Recommender Systems Langming Liu et al.
- Capabilities Of Gemini Models In Medicine Khaled Saab et al.
- Data Is All You Need: Finetuning Llms For Chip Design Via An Automated Design-data Augmentation Framework Kaiyan Chang et al.
- The Dawn After The Dark: An Empirical Study On Factuality Hallucination In Large Language Models Junyi Li et al.
- Openmedlm: Prompt Engineering Can Out-perform Fine-tuning In Medical Question-answering With Open-source Large Language Models Jenish Maharjan et al.
- Revolutionizing Finance With Llms: An Overview Of Applications And Insights Huaqin Zhao et al.
- Benchmarking Retrieval-augmented Generation For Medicine Guangzhi Xiong, Qiao Jin, Zhiyong Lu, Aidong Zhang
- Closing The Gap Between Open-source And Commercial Large Language Models For Medical Evidence Summarization Gongbo Zhang et al.
- Building Better AI Agents: A Provocation On The Utilisation Of Persona In Llm-based Conversational Agents Guangzhi Sun, Xiao Zhan, Jose Such
- Gemma: Open Models Based On Gemini Research And Technology Gemma Team et al.
- Gemini 1.5: Unlocking Multimodal Understanding Across Millions Of Tokens Of Context Gemini Team et al.
- Code-aware Prompting: A Study Of Coverage Guided Test Generation In Regression Setting Using LLM Gabriel Ryan et al.
- Ai-tutoring In Software Engineering Education Eduard Frankford, Clemens Sauerwein, Patrick Bassner, Stephan Krusche, Ruth Breu
- Olmo: Accelerating The Science Of Language Models Dirk Groeneveld et al.
- Chemllm: A Chemical Large Language Model Di Zhang et al.
- Deepseek-v2: A Strong, Economical, And Efficient Mixture-of-experts Language Model Deepseek-ai et al.
- The Revolution Of Multimodal Large Language Models: A Survey Davide Caffagni et al.
- Deepseek-coder: When The Large Language Model Meets Programming -- The Rise Of Code Intelligence Daya Guo et al.
- Open Source Language Models Can Provide Feedback: Evaluating Llms' Ability To Help Students Using Gpt-4-as-a-judge Charles Koutcheme et al.
- MM1: Methods, Analysis & Insights From Multimodal LLM Pre-training Brandon Mckinzie et al.
- Moe-llava: Mixture Of Experts For Large Vision-language Models Bin Lin et al.
- Gemini Goes To Med School: Exploring The Capabilities Of Multimodal Large Language Models On Medical Challenge Problems & Hallucinations Ankit Pal, Malaikannan Sankarasubbu
- RAG Vs Fine-tuning: Pipelines, Tradeoffs, And A Case Study On Agriculture Angels Balaguer et al.
- Optimization Methods For Personalizing Large Language Models Through Retrieval Augmentation Alireza Salemi, Surya Kallumadi, Hamed Zamani
- Yi: Open Foundation Models By 01.AI 01.AI et al.
- Quality Of Answers Of Generative Large Language Models Vs Peer Patients For Interpreting Lab Test Results For Lay Patients: Evaluation Study Zhe He et al.
- How Far Are We To GPT-4V? Closing The Gap To Commercial Multimodal Models With Open-source Suites Zhe Chen et al.
- Llmparser: An Exploratory Study On Using Large Language Models For Log Parsing Zeyang Ma, An Ran Chen, Dong Jae Kim, Tse-hsun Chen, Shaowei Wang
- Hallucination Detection: Robustly Discerning Reliable Answers In Large Language Models Yuyan Chen et al.
- Large Language Model (LLM) AI Text Generation Detection Based On Transformer Deep Learning Algorithm Yuhong Mo, Hao Qin, Yushan Dong, Ziyi Zhu, Zhenglin Li
- Trustllm: Trustworthiness In Large Language Models Yue Huang et al.
- Large Language Models In Mental Health Care: A Scoping Review Yining Hua et al.
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
- Biomistral: A Collection Of Open-source Pretrained Large Language Models For Medical Domains Yanis Labrak et al.
- Datasets For Large Language Models: A Comprehensive Survey Yang Liu, Jiahuan Cao, Chongyu Liu, Kai Ding, Lianwen Jin
- Data-efficient Fine-tuning For Llm-based Recommendation Xinyu Lin et al.
- Mgte: Generalized Long-context Text Representation And Reranking Models For Multilingual Text Retrieval Xin Zhang et al.
- CRUD-RAG: A Comprehensive Chinese Benchmark For Retrieval-augmented Generation Of Large Language Models Yuanjie Lyu et al.
- Measurement Of Llm's Philosophies Of Human Nature Minheng Ni et al.
- Can Generative Llms Create Query Variants For Test Collections? An Exploratory Study Marwah Alaofi, Luke Gallagher, Mark Sanderson, Falk Scholer, Paul Thomas
- Findings Of The Babylm Challenge: Sample-efficient Pretraining On Developmentally Plausible Corpora Alex Warstadt et al.
🏷 Fairness
- Learning To Deceive With Attention-based Explanations Danish Pruthi, Mansi Gupta, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton
- Societal Biases In Language Generation: Progress And Challenges Emily Sheng, Kai-wei Chang, Premkumar Natarajan, Nanyun Peng
- Improving Gender Fairness Of Pre-trained Language Models Without Catastrophic Forgetting Zahra Fatemi, Chen Xing, Wenhao Liu, Caiming Xiong
- Sustainable Modular Debiasing Of Language Models Anne Lauscher, Tobias Lüken, Goran Glavaš
- Perturbation Augmentation For Fairer NLP Rebecca Qian et al.
- Chatgpt: The End Of Online Exam Integrity? Teo Susnjak
- Quantifying Memorization Across Neural Language Models Nicholas Carlini et al.
- Holistic Evaluation Of Language Models Percy Liang et al.
- Mvbench: A Comprehensive Multi-modal Video Understanding Benchmark Kunchang Li et al.
- Is Chatgpt Fair For Recommendation? Evaluating Fairness In Large Language Model Recommendation Jizhi Zhang et al.
- Bias And Fairness In Chatbots: An Overview Jintang Xue et al.
- Regulating Chatgpt And Other Large Generative AI Models Philipp Hacker, Andreas Engel, Marco Mauer
- Scaling Vision Transformers To 22 Billion Parameters Mostafa Dehghani et al.
- Having Beer After Prayer? Measuring Cultural Bias In Large Language Models Tarek Naous, Michael J. Ryan, Alan Ritter, Wei Xu
- GPT Detectors Are Biased Against Non-native English Writers Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, James Zou
- Can Chatgpt Assess Human Personalities? A General Evaluation Framework Haocong Rao, Cyril Leung, Chunyan Miao
- Facilitating Self-guided Mental Health Interventions Through Human-language Model Interaction: A Case Study Of Cognitive Restructuring Ashish Sharma, Kevin Rushton, Inna Wanyin Lin, Theresa Nguyen, Tim Althoff
- Opening Up Chatgpt: Tracking Openness, Transparency, And Accountability In Instruction-tuned Text Generators Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse
- Language Model Tokenizers Introduce Unfairness Between Languages Aleksandar Petrov, Emanuele La Malfa, Philip H. S. Torr, Adel Bibi
- Better To Ask In English: Cross-lingual Evaluation Of Large Language Models For Healthcare Queries Yiqiao Jin et al.
- "kelly Is A Warm Person, Joseph Is A Role Model": Gender Biases In Llm-generated Reference Letters Yixin Wan et al.
- Trustworthy Llms: A Survey And Guideline For Evaluating Large Language Models' Alignment Yang Liu et al.
- Mapping The Ethics Of Generative AI: A Comprehensive Scoping Review Thilo Hagendorff
- Trustllm: Trustworthiness In Large Language Models Yue Huang et al.
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
🏷 Few-Shot
- Learning To Few-shot Learn Across Diverse Natural Language Classification Tasks Trapit Bansal, Rishikesh Jha, Andrew Mccallum
- Few-shot NLG With Pre-trained Language Model Zhiyu Chen, Harini Eavani, Wenhu Chen, Yinyin Liu, William Yang Wang
- CG-BERT: Conditional Text Generation With BERT For Generalized Few-shot Intent Detection Congying Xia, Chenwei Zhang, Hoang Nguyen, Jiawei Zhang, Philip Yu
- Making Pre-trained Language Models Better Few-shot Learners Tianyu Gao, Adam Fisch, Danqi Chen
- TOD-BERT: Pre-trained Natural Language Understanding For Task-oriented Dialogue Chien-sheng Wu, Steven Hoi, Richard Socher, Caiming Xiong
- Few-shot Generative Conversational Query Rewriting Shi Yu et al.
- Few-shot Text Generation With Pattern-exploiting Training Timo Schick, Hinrich Schütze
- It's Not Just Size That Matters: Small Language Models Are Also Few-shot Learners Timo Schick, Hinrich Schütze
- Logic2text: High-fidelity Natural Language Generation From Logical Forms Zhiyu Chen et al.
- SOLOIST: Building Task Bots At Scale With Transfer Learning And Machine Teaching Baolin Peng et al.
- Few-shot Natural Language Generation For Task-oriented Dialog Baolin Peng et al.
- The Turking Test: Can Language Models Understand Instructions? Avia Efrat, Omer Levy
- From Zero To Hero: On The Limitations Of Zero-shot Cross-lingual Transfer With Multilingual Transformers Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš
- Language Models As Few-shot Learner For Task-oriented Dialogue Systems Andrea Madotto, Zihan Liu, Zhaojiang Lin, Pascale Fung
- CPM: A Large-scale Generative Chinese Pre-trained Language Model Zhengyan Zhang et al.
- Language Models Are Few-shot Learners Tom B. Brown et al.
- KGPT: Knowledge-grounded Pre-training For Data-to-text Generation Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang
- Few-shot Learning With Multilingual Language Models Xi Victoria Lin et al.
- Lightner: A Lightweight Tuning Paradigm For Low-resource NER Via Pluggable Prompting Xiang Chen et al.
- GPT Understands, Too Xiao Liu et al.
- Learning How To Ask: Querying Lms With Mixtures Of Soft Prompts Guanghui Qin, Jason Eisner
- Language Models Are Few-shot Multilingual Learners Genta Indra Winata et al.
- Cutting Down On Prompts And Parameters: Simple Few-shot Learning With Language Models Robert L. Iv Logan et al.
- Reframing Instructional Prompts To Gptk's Language Swaroop Mishra, Daniel Khashabi, Chitta Baral, Yejin Choi, Hannaneh Hajishirzi
- Crossing The Conversational Chasm: A Primer On Natural Language Processing For Multilingual Task-oriented Dialogue Systems Evgeniia Razumovskaia et al.
- True Few-shot Learning With Language Models Ethan Perez, Douwe Kiela, Kyunghyun Cho
- Prompt Programming For Large Language Models: Beyond The Few-shot Paradigm Laria Reynolds, Kyle Mcdonell
- Less Is More: Pre-train A Strong Text Encoder For Dense Retrieval Using A Weak Decoder Shuqi Lu et al.
- An Explanation Of In-context Learning As Implicit Bayesian Inference Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma
- Entailment As Few-shot Learner Sinong Wang, Han Fang, Madian Khabsa, Hanzi Mao, Hao Ma
- Fewshotqa: A Simple Framework For Few-shot Learning Of Question Answering Tasks Using Pre-trained Text-to-text Models Rakesh Chada, Pradeep Natarajan
- GPT-3 Models Are Poor Few-shot Learners In The Biomedical Domain Milad Moradi, Kathrin Blagec, Florian Haberl, Matthias Samwald
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- Efficient Large Scale Language Modeling With Mixtures Of Experts Mikel Artetxe et al.
- PTR: Prompt Tuning With Rules For Text Classification Xu Han, Weilin Zhao, Ning Ding, Zhiyuan Liu, Maosong Sun
- Generate, Annotate, And Learn: NLP With Synthetic Text Xuanli He, Islam Nassar, Jamie Kiros, Gholamreza Haffari, Mohammad Norouzi
- Improving And Simplifying Pattern Exploiting Training Derek Tam, Rakesh R Menon, Mohit Bansal, Shashank Srivastava, Colin Raffel
- PPT: Pre-trained Prompt Tuning For Few-shot Learning Yuxian Gu, Xu Han, Zhiyuan Liu, Minlie Huang
- Calibrate Before Use: Improving Few-shot Performance Of Language Models Tony Z. Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh
- True Few-shot Learning With Prompts -- A Real-world Perspective Timo Schick, Hinrich Schütze
- What To Pre-train On? Efficient Intermediate Task Selection Clifton Poth, Jonas Pfeiffer, Andreas Rücklé, Iryna Gurevych
- Exploring Prompt-based Few-shot Learning For Grounded Dialog Generation Chujie Zheng, Minlie Huang
- LAION-400M: Open Dataset Of Clip-filtered 400 Million Image-text Pairs Christoph Schuhmann et al.
- LFPT5: A Unified Framework For Lifelong Few-shot Language Learning Based On Prompt Tuning Of T5 Chengwei Qin, Shafiq Joty
- Prompting Visual-language Models For Efficient Video Understanding Chen Ju, Tengda Han, Kunhao Zheng, Ya Zhang, Weidi Xie
- Pre-train, Prompt, And Predict: A Systematic Survey Of Prompting Methods In Natural Language Processing Pengfei Liu et al.
- RAFT: A Real-world Few-shot Text Classification Benchmark Neel Alex et al.
- Fantastically Ordered Prompts And Where To Find Them: Overcoming Few-shot Prompt Order Sensitivity Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, Pontus Stenetorp
- The Power Of Scale For Parameter-efficient Prompt Tuning Brian Lester, Rami Al-rfou, Noah Constant
- What Changes Can Large-scale Language Models Bring? Intensive Study On Hyperclova: Billions-scale Korean Generative Pretrained Transformers Boseop Kim et al.
- Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners Ningyu Zhang et al.
- Prompt-learning For Fine-grained Entity Typing Ning Ding et al.
- NSP-BERT: A Prompt-based Few-shot Learner Through An Original Pre-training Task -- Next Sentence Prediction Yi Sun, Yu Zheng, Chao Hao, Hangping Qiu
- Towards Few-shot Fact-checking Via Perplexity Nayeon Lee, Yejin Bang, Andrea Madotto, Madian Khabsa, Pascale Fung
- ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Yu Sun et al.
- Wordcraft: A Human-ai Collaborative Editor For Story Writing Andy Coenen, Luke Davis, Daphne Ippolito, Emily Reif, Ann Yuan
- Few-shot Bot: Prompt-based Learning For Dialogue Systems Andrea Madotto, Zhaojiang Lin, Genta Indra Winata, Pascale Fung
- Revisiting Self-training For Few-shot Learning Of Language Model Yiming Chen et al.
- Few-shot Question Answering By Pretraining Span Selection Ori Ram, Yuval Kirstain, Jonathan Berant, Amir Globerson, Omer Levy
- An Empirical Study Of GPT-3 For Few-shot Knowledge-based VQA Zhengyuan Yang et al.
- Do Prompt-based Models Really Understand The Meaning Of Their Prompts? Albert Webson, Ellie Pavlick
- Clip-adapter: Better Vision-language Models With Feature Adapters Peng Gao et al.
- FLEX: Unifying Evaluation For Few-shot NLP Jonathan Bragg, Arman Cohan, Kyle Lo, Iz Beltagy
- Constrained Language Models Yield Few-shot Semantic Parsers Richard Shin et al.
- Tip-adapter: Training-free Clip-adapter For Better Vision-language Modeling Renrui Zhang et al.
- Metaicl: Learning To Learn In Context Sewon Min, Mike Lewis, Luke Zettlemoyer, Hannaneh Hajishirzi
- Pangu-α: Large-scale Autoregressive Pretrained Chinese Language Models With Auto-parallel Computation Wei Zeng et al.
- Multimodal Few-shot Learning With Frozen Language Models Maria Tsimpoukelli et al.
- What Makes Good In-context Examples For GPT-3? Jiachang Liu et al.
- Finetuned Language Models Are Zero-shot Learners Jason Wei et al.
- Show Your Work: Scratchpads For Intermediate Computation With Language Models Maxwell Nye et al.
- GALAXY: A Generative Pre-trained Model For Task-oriented Dialog With Semi-supervised Learning And Explicit Policy Injection Wanwei He et al.
- Program Synthesis With Large Language Models Jacob Austin et al.
- Gpt3mix: Leveraging Large-scale Language Models For Text Augmentation Kang Min Yoo, Dongju Park, Jaewook Kang, Sang-woo Lee, Woomyeong Park
- WARP: Word-level Adversarial Reprogramming Karen Hambardzumyan, Hrant Khachatrian, Jonathan May
- Few-shot Knowledge Graph-to-text Generation With Pretrained Language Models Junyi Li et al.
- Reframing Human-ai Collaboration For Generating Free-text Explanations Sarah Wiegreffe, Jack Hessel, Swabha Swayamdipta, Mark Riedl, Yejin Choi
- Scaling Instruction-finetuned Language Models Hyung Won Chung et al.
- Recommendation As Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5) Shijie Geng, Shuchang Liu, Zuohui Fu, Yingqiang Ge, Yongfeng Zhang
- On The Effect Of Pretraining Corpora On In-context Learning By A Large-scale Language Model Seongjin Shin et al.
- Selective Annotation Makes Language Models Better Few-shot Learners Hongjin Su et al.
- Demystifying Prompts In Language Models Via Perplexity Estimation Hila Gonen, Srini Iyer, Terra Blevins, Noah A. Smith, Luke Zettlemoyer
- Few-shot Parameter-efficient Fine-tuning Is Better And Cheaper Than In-context Learning Haokun Liu et al.
- Program Of Thoughts Prompting: Disentangling Computation From Reasoning For Numerical Reasoning Tasks Wenhu Chen, Xueguang Ma, Xinyi Wang, William W. Cohen
- In-context Learning For Few-shot Dialogue State Tracking Yushi Hu et al.
- Large Language Models Are Few(1)-shot Table Reasoners Wenhu Chen
- Decoupling Knowledge From Memorization: Retrieval-augmented Prompt Learning Xiang Chen et al.
- How To Prompt? Opportunities And Challenges Of Zero- And Few-shot Learning For Human-ai Interaction In Creative Applications Of Generative Models Hai Dang, Lukas Mecke, Florian Lehmann, Sven Goller, Daniel Buschek
- The Unreliability Of Explanations In Few-shot Prompting For Textual Reasoning Xi Ye, Greg Durrett
- Atlas: Few-shot Learning With Retrieval Augmented Language Models Gautier Izacard et al.
- Prototypical Verbalizer For Prompt-based Few-shot Tuning Ganqu Cui, Shengding Hu, Ning Ding, Longtao Huang, Zhiyuan Liu
- Data Augmentation For Intent Classification With Off-the-shelf Large Language Models Gaurav Sahu et al.
- Synchromesh: Reliable Code Generation From Pre-trained Language Models Gabriel Poesia et al.
- Alexatm 20B: Few-shot Learning Using A Large-scale Multilingual Seq2seq Model Saleh Soltan et al.
- Leveraging Large Language Models For Multiple Choice Question Answering Joshua Robinson, Christopher Michael Rytting, David Wingate
- OPT-IML: Scaling Language Model Instruction Meta Learning Through The Lens Of Generalization Srinivasan Iyer et al.
- Can Large Language Models Truly Understand Prompts? A Case Study With Negated Prompts Joel Jang, Seonghyeon Ye, Minjoon Seo
- Generating Sequences By Learning To Self-correct Sean Welleck et al.
- Instruction Tuning For Few-shot Aspect-based Sentiment Analysis Siddharth Varia et al.
- Ask Me Anything: A Simple Strategy For Prompting Language Models Simran Arora et al.
- SPACE-3: Unified Dialog Model Pre-training For Task-oriented Dialog Understanding And Generation Wanwei He et al.
- Flamingo: A Visual Language Model For Few-shot Learning Jean-baptiste Alayrac et al.
- Visconde: Multi-document QA With GPT-3 And Neural Reranking Jayr Pereira, Robson Fidalgo, Roberto Lotufo, Rodrigo Nogueira
- Gpt-neox-20b: An Open-source Autoregressive Language Model Sid Black et al.
- Using Deepspeed And Megatron To Train Megatron-turing NLG 530B, A Large-scale Generative Language Model Shaden Smith et al.
- RARR: Researching And Revising What Language Models Say, Using Language Models Luyu Gao et al.
- PAL: Program-aided Language Models Luyu Gao et al.
- Inpars: Data Augmentation For Information Retrieval Using Large Language Models Luiz Bonifacio, Hugo Abonizio, Marzieh Fadaee, Rodrigo Nogueira
- Instructionner: A Multi-task Instruction-based Generative Framework For Few-shot NER Liwen Wang et al.
- Efficient Few-shot Learning Without Prompts Lewis Tunstall et al.
- Data Distributional Properties Drive Emergent In-context Learning In Transformers Stephanie C. Y. Chan et al.
- Exploring The Universal Vulnerability Of Prompt-based Learning Paradigm Lei Xu, Yangyi Chen, Ganqu Cui, Hongcheng Gao, Zhiyuan Liu
- Promptagator: Few-shot Dense Retrieval From 8 Examples Zhuyun Dai et al.
- OPT: Open Pre-trained Transformer Language Models Susan Zhang et al.
- Can Large Language Models Reason About Medical Questions? Valentin Liévin, Christoffer Egeberg Hother, Andreas Geert Motzfeldt, Ole Winther
- Towards Using Few-shot Prompt Learning For Automating Model Completion Meriem Ben Chaaben, Lola Burgueño, Houari Sahraoui
- Language Models With Image Descriptors Are Strong Few-shot Video-language Learners Zhenhailong Wang et al.
- Deplot: One-shot Visual Language Reasoning By Plot-to-table Translation Fangyu Liu et al.
- Sgva-clip: Semantic-guided Visual Adapting Of Vision-language Models For Few-shot Image Classification Fang Peng, Xiaoshan Yang, Linhui Xiao, Yaowei Wang, Changsheng Xu
- Legal Prompting: Teaching A Language Model To Think Like A Lawyer Fangyi Yu, Lee Quartey, Frank Schilder
- Star: Bootstrapping Reasoning With Reasoning Eric Zelikman, Yuhuai Wu, Jesse Mu, Noah D. Goodman
- IGLUE: A Benchmark For Transfer Learning Across Modalities, Tasks, And Languages Emanuele Bugliarello et al.
- A Generative Language Model For Few-shot Aspect-based Sentiment Analysis Ehsan Hosseini-asl, Wenhao Liu, Caiming Xiong
- Successive Prompting For Decomposing Complex Questions Dheeru Dua, Shivanshu Gupta, Sameer Singh, Matt Gardner
- Self-adaptive In-context Learning: An Information Compression Perspective For In-context Example Selection And Ordering Zhiyong Wu, Yaoxiang Wang, Jiacheng Ye, Lingpeng Kong
- Factpegasus: Factuality-aware Pre-training And Fine-tuning For Abstractive Summarization David Wan, Mohit Bansal
- Prompting Palm For Translation: Assessing Strategies And Performance David Vilar et al.
- Adaprompt: Adaptive Model Training For Prompt-based NLP Yulong Chen et al.
- Language Model Cascades David Dohan et al.
- Rationale-augmented Ensembles In Language Models Xuezhi Wang et al.
- Hungry Hungry Hippos: Towards Language Modeling With State Space Models Daniel Y. Fu et al.
- Unified Vision And Language Prompt Learning Yuhang Zang, Wei Li, Kaiyang Zhou, Chen Huang, Chen Change Loy
- Prompt For Extraction? PAIE: Prompting Argument Interaction For Event Argument Extraction Yubo Ma et al.
- Prompting GPT-3 To Be Reliable Chenglei Si et al.
- Code4struct: Code Generation For Few-shot Event Structure Prediction Xingyao Wang, Sha Li, Heng Ji
- Llm-planner: Few-shot Grounded Planning For Embodied Agents With Large Language Models Chan Hee Song et al.
- Adamix: Mixture-of-adaptations For Parameter-efficient Model Tuning Yaqing Wang et al.
- Language Models Are General-purpose Interfaces Yaru Hao et al.
- Is GPT-3 A Good Data Annotator? Bosheng Ding et al.
- Expanding Language-image Pretrained Models For General Video Recognition Bolin Ni et al.
- Impact Of Pretraining Term Frequencies On Few-shot Reasoning Yasaman Razeghi, Robert L. Logan IV, Matt Gardner, Sameer Singh
- Thinking About GPT-3 In-context Learning For Biomedical IE? Think Again Bernal Jiménez Gutiérrez et al.
- Multi-lingual Evaluation Of Code Generation Models Ben Athiwaratkun et al.
- Prompt-aligned Gradient For Prompt Tuning Beier Zhu, Yulei Niu, Yucheng Han, Yue Wu, Hanwang Zhang
- GODEL: Large-scale Pre-training For Goal-directed Dialog Baolin Peng et al.
- Zero-shot Video Question Answering Via Frozen Bidirectional Language Models Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- Selection-inference: Exploiting Large Language Models For Interpretable Logical Reasoning Antonia Creswell, Murray Shanahan, Irina Higgins
- Internet-augmented Language Models Through Few-shot Prompting For Open-domain Question Answering Angeliki Lazaridou, Elena Gribovskaya, Wojciech Stokowiec, Nikolai Grigorev
- Qaner: Prompting Question Answering Models For Few-shot Named Entity Recognition Andy T. Liu et al.
- Making Large Language Models Better Reasoners With Step-aware Verifier Yifei Li et al.
- Large Language Models Are Human-level Prompt Engineers Yongchao Zhou et al.
- Don't Generate, Discriminate: A Proposal For Grounding Language Models To Real-world Environments Yu Gu, Xiang Deng, Yu Su
- Generating Training Data With Language Models: Towards Zero-shot Language Understanding Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han
- Language Models Of Code Are Few-shot Commonsense Learners Aman Madaan, Shuyan Zhou, Uri Alon, Yiming Yang, Graham Neubig
- Text And Patterns: For Effective Chain Of Thought, It Takes Two To Tango Aman Madaan, Amir Yazdanbakhsh
- Commonsenseqa 2.0: Exposing The Limits Of AI Through Gamification Alon Talmor et al.
- Can Language Models Learn From Explanations In Context? Andrew K. Lampinen et al.
- Personalized Prompt For Sequential Recommendation Yiqing Wu et al.
- ATTEMPT: Parameter-efficient Multi-task Tuning Via Attentional Mixtures Of Soft Prompts Akari Asai, Mohammadreza Salehi, Matthew E. Peters, Hannaneh Hajishirzi
- Palm: Scaling Language Modeling With Pathways Aakanksha Chowdhery et al.
- Retrieval-augmented Generative Question Answering For Event Argument Extraction Xinya Du, Heng Ji
- Learn To Explain: Multimodal Reasoning Via Thought Chains For Science Question Answering Pan Lu et al.
- Dynamic Prompt Learning Via Policy Gradient For Semi-structured Mathematical Reasoning Pan Lu et al.
- Code Generation Tools (almost) For Free? A Study Of Few-shot, Pre-trained Language Models On Code Patrick Bareiß, Beatriz Souza, Marcelo D'amorim, Michael Pradel
- Grounding Language With Visual Affordances Over Unstructured Data Oier Mees, Jessica Borja-diaz, Wolfram Burgard
- Unifiedskg: Unifying And Multi-tasking Structured Knowledge Grounding With Text-to-text Language Models Tianbao Xie et al.
- Clinical Prompt Learning With Frozen Language Models Niall Taylor, Yi Zhang, Dan Joyce, Alejo Nevado-holgado, Andrey Kormilitzin
- Challenging Big-bench Tasks And Whether Chain-of-thought Can Solve Them Mirac Suzgun et al.
- Rlprompt: Optimizing Discrete Text Prompts With Reinforcement Learning Mingkai Deng et al.
- Few-shot Training Llms For Project-specific Code-summarization Toufique Ahmed, Premkumar Devanbu
- Large Language Models Are Zero-shot Reasoners Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa
- Decomposed Prompting: A Modular Approach For Solving Complex Tasks Tushar Khot et al.
- Instructdial: Improving Zero And Few-shot Generalization In Dialogue Through Instruction Tuning Prakhar Gupta et al.
- Quantifying Language Models' Sensitivity To Spurious Features In Prompt Design Or: How I Learned To Start Worrying About Prompt Formatting Melanie Sclar, Yejin Choi, Yulia Tsvetkov, Alane Suhr
- Zero- And Few-shot Prompting With Llms: A Comparative Study With Fine-tuned Models For Bangla Sentiment Analysis Md. Arid Hasan et al.
- Enhancing CLIP With GPT-4: Harnessing Visual Descriptions As Prompts Mayug Maniparambil et al.
- An Empirical Evaluation Of Using Large Language Models For Automated Unit Test Generation Max Schäfer, Sarah Nadi, Aryaz Eghbali, Frank Tip
- Few-shot Fine-tuning Vs. In-context Learning: A Fair Comparison And Evaluation Marius Mosbach, Tiago Pimentel, Shauli Ravfogel, Dietrich Klakow, Yanai Elazar
- Flexkbqa: A Flexible Llm-powered Framework For Few-shot Knowledge Base Question Answering Zhenyu Li et al.
- Applenet: Visual Attention Parameterized Prompt Learning For Few-shot Remote Sensing Image Generalization Using CLIP Mainak Singha, Ankit Jha, Bhupendra Solanki, Shirsha Bose, Biplab Banerjee
- Enhancing Few-shot Text-to-sql Capabilities Of Large Language Models: A Study On Prompt Design Strategies Linyong Nan et al.
- Query2doc: Query Expansion With Large Language Models Liang Wang, Nan Yang, Furu Wei
- Aligning Instruction Tasks Unlocks Large Language Models As Zero-shot Relation Extractors Kai Zhang, Bernal Jiménez Gutiérrez, Yu Su
- A Comprehensive Capability Analysis Of GPT-3 And GPT-3.5 Series Models Junjie Ye et al.
- Is Chatgpt A Good Recommender? A Preliminary Study Junling Liu et al.
- Increasing Diversity While Maintaining Accuracy: Text Data Generation With Large Language Models And Human Interventions John Joon Young Chung, Ece Kamar, Saleema Amershi
- The Potential And Pitfalls Of Using A Large Language Model Such As Chatgpt Or GPT-4 As A Clinical Assistant Jingqing Zhang et al.
- Prompt-and-align: Prompt-based Social Alignment For Few-shot Fake News Detection Jiaying Wu, Shen Li, Ailin Deng, Miao Xiong, Bryan Hooi
- Empowering Molecule Discovery For Molecule-caption Translation With Large Language Models: A Chatgpt Perspective Jiatong Li et al.
- Rella: Retrieval-enhanced Large Language Models For Lifelong Sequential Behavior Comprehension In Recommendation Jianghao Lin et al.
- Memory-efficient Fine-tuning Of Compressed Large Language Models Via Sub-4-bit Integer Quantization Jeonghoon Kim et al.
- Chatgpt: Jack Of All Trades, Master Of None Jan Kocoń et al.
- Large Language Model Augmented Narrative Driven Recommendations Sheshera Mysore, Andrew Mccallum, Hamed Zamani
- Mixture-of-experts Meets Instruction Tuning: A Winning Combination For Large Language Models Sheng Shen et al.
- The Flan Collection: Designing Data And Methods For Effective Instruction Tuning Shayne Longpre et al.
- The Cot Collection: Improving Zero-shot And Few-shot Learning Of Language Models Via Chain-of-thought Fine-tuning Seungone Kim et al.
- On Codex Prompt Engineering For OCL Generation: An Empirical Study Seif Abukhalaf, Mohammad Hamdaqa, Foutse Khomh
- Language Is Not All You Need: Aligning Perception With Language Models Shaohan Huang et al.
- Large Language Models Are Competitive Near Cold-start Recommenders For Language- And Item-based Preferences Scott Sanner, Krisztian Balog, Filip Radlinski, Ben Wedin, Lucas Dixon
- Prompt, Generate, Then Cache: Cascade Of Foundation Models Makes Strong Few-shot Learners Renrui Zhang et al.
- Sabiá: Portuguese Large Language Models Ramon Pires, Hugo Abonizio, Thales Sales Almeida, Rodrigo Nogueira
- Faithful Chain-of-thought Reasoning Qing Lyu et al.
- Large Language Models Sensitivity To The Order Of Options In Multiple-choice Questions Pouya Pezeshkpour, Estevam Hruschka
- Dspy: Compiling Declarative Language Model Calls Into Self-improving Pipelines Omar Khattab et al.
- Chameleon: Plug-and-play Compositional Reasoning With Large Language Models Pan Lu et al.
- Large Language Models Are Built-in Autoregressive Search Engines Noah Ziems, Wenhao Yu, Zhihan Zhang, Meng Jiang
- Consistency Analysis Of Chatgpt Myeongjun Erik Jang, Thomas Lukasiewicz
- Label Supervised Llama Finetuning Zongxi Li et al.
- DIN-SQL: Decomposed In-context Learning Of Text-to-sql With Self-correction Mohammadreza Pourreza, Davood Rafiei
- Time-llm: Time Series Forecasting By Reprogramming Large Language Models Ming Jin et al.
- Reward Design With Language Models Minae Kwon, Sang Michael Xie, Kalesha Bullard, Dorsa Sadigh
- Med-flamingo: A Multimodal Medical Few-shot Learner Michael Moor et al.
- Empirical Study Of Zero-shot NER With Chatgpt Tingyu Xie et al.
- Few-shot In-context Learning For Knowledge Base Question Answering Tianle Li et al.
- Is Chatgpt A Highly Fluent Grammatical Error Correction System? A Comprehensive Evaluation Tao Fang et al.
- What Can Large Language Models Do In Chemistry? A Comprehensive Benchmark On Eight Tasks Taicheng Guo et al.
- Analyzing The Performance Of GPT-3.5 And GPT-4 In Grammatical Error Correction Steven Coyne, Keisuke Sakaguchi, Diana Galvan-sosa, Michael Zock, Kentaro Inui
- Pythia: A Suite For Analyzing Large Language Models Across Training And Scaling Stella Biderman et al.
- Revisiting Relation Extraction In The Era Of Large Language Models Somin Wadhwa, Silvio Amir, Byron C. Wallace
- Wikichat: Stopping The Hallucination Of Large Language Model Chatbots By Few-shot Grounding On Wikipedia Sina J. Semnani, Violet Z. Yao, Heidi C. Zhang, Monica S. Lam
- Do Llms Understand User Preferences? Evaluating Llms On User Rating Prediction Wang-cheng Kang et al.
- Inpars-v2: Large Language Models As Efficient Dataset Generators For Information Retrieval Vitor Jeronymo et al.
- Freshllms: Refreshing Large Language Models With Search Engine Augmentation Tu Vu et al.
- Better Patching Using LLM Prompting, Via Self-consistency Toufique Ahmed, Premkumar Devanbu
- Automatic Semantic Augmentation Of Language Model Prompts (for Code Summarization) Toufique Ahmed, Kunal Suresh Pai, Premkumar Devanbu, Earl T. Barr
- Llama Guard: Llm-based Input-output Safeguard For Human-ai Conversations Hakan Inan et al.
- Applying Large Language Models And Chain-of-thought For Automatic Scoring Gyeong-geon Lee, Ehsan Latif, Xuansheng Wu, Ninghao Liu, Xiaoming Zhai
- Auggpt: Leveraging Chatgpt For Text Data Augmentation Haixing Dai et al.
- Batch Prompting: Efficient Inference With Large Language Model Apis Zhoujun Cheng, Jungo Kasai, Tao Yu
- Language Model Crossover: Variation Through Few-shot Prompting Elliot Meyerson et al.
- Read-only Prompt Optimization For Vision-language Few-shot Learning Dongjun Lee et al.
- Evaluating Open-domain Question Answering In The Era Of Large Language Models Ehsan Kamalloo, Nouha Dziri, Charles L. A. Clarke, Davood Rafiei
- Promptner: Prompting For Named Entity Recognition Dhananjay Ashok, Zachary C. Lipton
- Distilling Step-by-step! Outperforming Larger Language Models With Less Training Data And Smaller Model Sizes Cheng-yu Hsieh et al.
- Generative Speech Recognition Error Correction With Large Language Models And Task-activating Prompting Chao-han Huck Yang et al.
- Blackvip: Black-box Visual Prompting For Robust Transfer Learning Changdae Oh et al.
- ART: Automatic Multi-step Reasoning And Tool-use For Large Language Models Bhargavi Paranjape et al.
- Large Language Models In The Workplace: A Case Study On Prompt Engineering For Job Type Classification Benjamin Clavié, Alexandru Ciceu, Frederick Naylor, Guillaume Soulié, Thomas Brightwell
- Coupling Large Language Models With Logic Programming For Robust And General Reasoning From Text Zhun Yang, Adam Ishay, Joohyung Lee
- Refactoring Programs Using Large Language Models With Few-shot Examples Atsushi Shirafuji, Yusuke Oda, Jun Suzuki, Makoto Morishita, Yutaka Watanobe
- Model Tuning Or Prompt Tuning? A Study Of Large Language Models For Clinical Concept And Relation Extraction Cheng Peng et al.
- Adaptive Machine Translation With Large Language Models Yasmin Moslem, Rejwanul Haque, John D. Kelleher, Andy Way
- Enhancing Job Recommendation Through Llm-based Generative Adversarial Networks Yingpeng Du et al.
- Improving Factuality And Reasoning In Language Models Through Multiagent Debate Yilun Du, Shuang Li, Antonio Torralba, Joshua B. Tenenbaum, Igor Mordatch
- Human-centric Autonomous Systems With Llms For User Command Reasoning Yi Yang et al.
- Autotamp: Autoregressive Task And Motion Planning With Llms As Translators And Checkers Yongchao Chen et al.
- Specializing Smaller Language Models Towards Multi-step Reasoning Yao Fu, Hao Peng, Litu Ou, Ashish Sabharwal, Tushar Khot
- Recmind: Large Language Model Powered Agent For Recommendation Yancheng Wang et al.
- Improving Large Language Models For Clinical Named Entity Recognition Via Prompt Engineering Yan Hu et al.
- Mental-llm: Leveraging Large Language Models For Mental Health Prediction Via Online Text Data Xuhai Xu et al.
- Chat With The Environment: Interactive Multimodal Perception Using Large Language Models Xufeng Zhao, Mengdi Li, Cornelius Weber, Muhammad Burhan Hafez, Stefan Wermter
- Teaching Large Language Models To Self-debug Xinyun Chen, Maxwell Lin, Nathanael Schärli, Denny Zhou
- How To Unleash The Power Of Large Language Models For Few-shot Relation Extraction? Xin Xu, Yuqi Zhu, Xiaohan Wang, Ningyu Zhang
- Visual Adversarial Examples Jailbreak Aligned Large Language Models Xiangyu Qi et al.
- The Unreasonable Effectiveness Of Few-shot Learning For Machine Translation Xavier Garcia et al.
- Sentence Simplification Via Large Language Models Yutao Feng, Jipeng Qiang, Yun Li, Yunhao Yuan, Yi Zhu
- Longbench: A Bilingual, Multitask Benchmark For Long Context Understanding Yushi Bai et al.
- Large Language Models Meet Collaborative Filtering: An Efficient All-round Llm-based Recommender System Sein Kim et al.
- Large Language Model Capabilities In Perioperative Risk Prediction And Prognostication Philip Chung et al.
- Iris: An Ai-driven Virtual Tutor For Computer Science Education Patrick Bassner, Eduard Frankford, Stephan Krusche
- Openmedlm: Prompt Engineering Can Outperform Fine-tuning In Medical Question-answering With Open-source Large Language Models Jenish Maharjan et al.
- MM1: Methods, Analysis & Insights From Multimodal LLM Pre-training Brandon Mckinzie et al.
- Llmparser: An Exploratory Study On Using Large Language Models For Log Parsing Zeyang Ma, An Ran Chen, Dong Jae Kim, Tse-hsun Chen, Shaowei Wang
- Unist: A Prompt-empowered Universal Model For Urban Spatio-temporal Prediction Yuan Yuan, Jingtao Ding, Jie Feng, Depeng Jin, Yong Li
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
- Data-efficient Fine-tuning For Llm-based Recommendation Xinyu Lin et al.
🏷 Fine-Tuning
- Neural Personalized Response Generation As Domain Adaptation Weinan Zhang, Ting Liu, Yifa Wang, Qingfu Zhu
- Fine Grained Knowledge Transfer For Personalized Task-oriented Dialogue Systems Kaixiang Mo, Yu Zhang, Qiang Yang, Pascale Fung
- Multilingual Constituency Parsing With Self-attention And Pre-training Nikita Kitaev, Steven Cao, Dan Klein
- Improving Machine Reading Comprehension With General Reading Strategies Kai Sun, Dian Yu, Dong Yu, Claire Cardie
- Can You Tell Me How To Get Past Sesame Street? Sentence-level Pretraining Beyond Language Modeling Alex Wang et al.
- Training Millions Of Personalized Dialogue Agents Pierre-emmanuel Mazaré, Samuel Humeau, Martin Raison, Antoine Bordes
- Zero-shot Adaptive Transfer For Conversational Language Understanding Sungjin Lee, Rahul Jha
- Polite Dialogue Generation Without Parallel Data Tong Niu, Mohit Bansal
- Sample Efficient Text Summarization Using A Single Pre-trained Transformer Urvashi Khandelwal, Kevin Clark, Dan Jurafsky, Lukasz Kaiser
- Structbert: Incorporating Language Structures Into Pre-training For Deep Language Understanding Wei Wang et al.
- Probing Natural Language Inference Models Through Semantic Fragments Kyle Richardson, Hai Hu, Lawrence S. Moss, Ashish Sabharwal
- Acquiring Knowledge From Pre-trained Model To Neural Machine Translation Rongxiang Weng, Heng Yu, Shujian Huang, Shanbo Cheng, Weihua Luo
- Align, Mask And Select: A Simple Method For Incorporating Commonsense Knowledge Into Language Representation Models Zhi-xiu Ye, Qian Chen, Wen Wang, Zhen-hua Ling
- Olmpics -- On What Language Model Pre-training Captures Alon Talmor, Yanai Elazar, Yoav Goldberg, Jonathan Berant
- Masked Language Model Scoring Julian Salazar, Davis Liang, Toan Q. Nguyen, Katrin Kirchhoff
- Q8BERT: Quantized 8bit BERT Ofir Zafrir, Guy Boudoukh, Peter Izsak, Moshe Wasserblat
- Improving Transformer Models By Reordering Their Sublayers Ofir Press, Noah A. Smith, Omer Levy
- Domain Adaptive Dialog Generation Via Meta Learning Kun Qian, Zhou Yu
- Language Models As Knowledge Bases? Fabio Petroni et al.
- BERT For Joint Intent Classification And Slot Filling Qian Chen, Zhu Zhuo, Wen Wang
- Distilbert, A Distilled Version Of BERT: Smaller, Faster, Cheaper And Lighter Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
- Multifit: Efficient Multi-lingual Language Model Fine-tuning Julian Martin Eisenschlos et al.
- Unicoder: A Universal Language Encoder By Pre-training With Multiple Cross-lingual Tasks Haoyang Huang et al.
- LXMERT: Learning Cross-modality Encoder Representations From Transformers Hao Tan, Mohit Bansal
- Well-read Students Learn Better: On The Importance Of Pre-training Compact Models Iulia Turc, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Generalization In Generation: A Closer Look At Exposure Bias Florian Schmidt
- Reweighted Proximal Pruning For Large-scale Language Representation Fu-ming Guo, Sijia Liu, Finlay S. Mungall, Xue Lin, Yanzhi Wang
- Frustratingly Easy Natural Question Answering Lin Pan et al.
- Review Conversational Reading Comprehension Hu Xu, Bing Liu, Lei Shu, Philip S. Yu
- Transfertransfo: A Transfer Learning Approach For Neural Network Based Conversational Agents Thomas Wolf, Victor Sanh, Julien Chaumond, Clement Delangue
- How Does BERT Answer Questions? A Layer-wise Analysis Of Transformer Representations Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- What Would Elsa Do? Freezing Layers During Transformer Fine-tuning Jaejun Lee, Raphael Tang, Jimmy Lin
- Winogrande: An Adversarial Winograd Schema Challenge At Scale Keisuke Sakaguchi, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi
- Story Ending Prediction By Transferable BERT Zhongyang Li, Xiao Ding, Ting Liu
- MASS: Masked Sequence To Sequence Pre-training For Language Generation Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-yan Liu
- Scheduled Sampling For Transformers Tsvetomila Mihaylova, André F. T. Martins
- Paraphrasing With Large Language Models Sam Witteveen, Martin Andrews
- Semantics-aware BERT For Language Understanding Zhuosheng Zhang et al.
- Data Augmentation For BERT Fine-tuning In Open-domain Question Answering Wei Yang et al.
- Patent Claim Generation By Fine-tuning Openai GPT-2 Jieh-sheng Lee, Jieh Hsiang
- Inducing Brain-relevant Bias In Natural Language Processing Models Dan Schwartz, Mariya Toneva, Leila Wehbe
- Parameter-efficient Transfer Learning For NLP Neil Houlsby et al.
- Automatic Spanish Translation Of The Squad Dataset For Multilingual Question Answering Casimiro Pio Carrino, Marta R. Costa-jussà, José A. R. Fonollosa
- Visualizing And Understanding The Effectiveness Of BERT Yaru Hao, Li Dong, Furu Wei, Ke Xu
- Do Attention Heads In BERT Track Syntactic Dependencies? Phu Mon Htut, Jason Phang, Shikha Bordia, Samuel R. Bowman
- Tree Transformer: Integrating Tree Structures Into Self-attention Yau-shian Wang, Hung-yi Lee, Yun-nung Chen
- Structured Pruning Of Large Language Models Ziheng Wang, Jeremy Wohlwend, Tao Lei
- Fine-tuning Language Models From Human Preferences Daniel M. Ziegler et al.
- Model Compression With Two-stage Multi-teacher Knowledge Distillation For Web Question Answering System Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang
- Plug And Play Language Models: A Simple Approach To Controlled Text Generation Sumanth Dathathri et al.
- Hello, It's GPT-2 -- How Can I Help You? Towards The Use Of Pretrained Language Models For Task-oriented Dialogue Systems Paweł Budzianowski, Ivan Vulić
- Lakhnes: Improving Multi-instrumental Music Generation With Cross-domain Pre-training Chris Donahue, Huanru Henry Mao, Yiting Ethan Li, Garrison W. Cottrell, Julian Mcauley
- Microsoft Translator At WMT 2019: Towards Large-scale Document-level Neural Machine Translation Marcin Junczys-dowmunt
- Learning To Few-shot Learn Across Diverse Natural Language Classification Tasks Trapit Bansal, Rishikesh Jha, Andrew Mccallum
- Syntax-infused Transformer And BERT Models For Machine Translation And Natural Language Understanding Dhanasekar Sundararaman et al.
- Transfer Fine-tuning: A BERT Case Study Yuki Arase, Junichi Tsujii
- Towards Transfer Learning For End-to-end Speech Synthesis From Deep Pre-trained Language Models Wei Fang, Yu-an Chung, James Glass
- Text Summarization With Pretrained Encoders Yang Liu, Mirella Lapata
- Towards Making The Most Of BERT In Neural Machine Translation Jiacheng Yang et al.
- TANDA: Transfer And Adapt Pre-trained Transformer Models For Answer Sentence Selection Siddhant Garg, Thuy Vu, Alessandro Moschitti
- Exploring The Limits Of Transfer Learning With A Unified Text-to-text Transformer Colin Raffel et al.
- Harnessing Evolution Of Multi-turn Conversations For Effective Answer Retrieval Mohammad Aliannejadi, Manajit Chakraborty, Esteban Andrés Ríssola, Fabio Crestani
- Modifying Memories In Transformer Models Chen Zhu et al.
- Pre-training Text-to-text Transformers For Concept-centric Common Sense Wangchunshu Zhou et al.
- Pre-trained Summarization Distillation Sam Shleifer, Alexander M. Rush
- REALM: Retrieval-augmented Language Model Pre-training Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-wei Chang
- Unifiedqa: Crossing Format Boundaries With A Single QA System Daniel Khashabi et al.
- Fine-tuning Pretrained Language Models: Weight Initializations, Data Orders, And Early Stopping Jesse Dodge et al.
- Reducing Gender Bias In Neural Machine Translation As A Domain Adaptation Problem Danielle Saunders, Bill Byrne
- The Chess Transformer: Mastering Play Using Generative Language Models David Noever, Matt Ciolino, Josh Kalin
- Intermediate-task Transfer Learning With Pretrained Models For Natural Language Understanding: When And Why Does It Work? Yada Pruksachatkun et al.
- Speaker-aware BERT For Multi-turn Response Selection In Retrieval-based Chatbots Jia-chen Gu et al.
- Making Pre-trained Language Models Better Few-shot Learners Tianyu Gao, Adam Fisch, Danqi Chen
- Exploring Fine-tuning Techniques For Pre-trained Cross-lingual Models Via Continual Learning Zihan Liu, Genta Indra Winata, Andrea Madotto, Pascale Fung
- Fine-tuning Pre-trained Language Model With Weak Supervision: A Contrastive-regularized Self-training Approach Yue Yu et al.
- A Knowledge-enhanced Pretraining Model For Commonsense Story Generation Jian Guan, Fei Huang, Zhihao Zhao, Xiaoyan Zhu, Minlie Huang
- Exploring Versatile Generative Language Model Via Parameter-efficient Transfer Learning Zhaojiang Lin, Andrea Madotto, Pascale Fung
- Better Robustness By More Coverage: Adversarial Training With Mixup Augmentation For Robust Fine-tuning Chenglei Si et al.
- Towards Learning A Generic Agent For Vision-and-language Navigation Via Pre-training Weituo Hao, Chunyuan Li, Xiujun Li, Lawrence Carin, Jianfeng Gao
- Coreferential Reasoning Learning For Language Representation Deming Ye et al.
- A Simple But Tough-to-beat Data Augmentation Approach For Natural Language Understanding And Generation Dinghan Shen, Mingzhi Zheng, Yelong Shen, Yanru Qu, Weizhu Chen
- On The Stability Of Fine-tuning BERT: Misconceptions, Explanations, And Strong Baselines Marius Mosbach, Maksym Andriushchenko, Dietrich Klakow
- ERNIE-GEN: An Enhanced Multi-flow Pre-training And Fine-tuning Framework For Natural Language Generation Dongling Xiao et al.
- Codebert: A Pre-trained Model For Programming And Natural Languages Zhangyin Feng et al.
- PALM: Pre-training An Autoencoding & Autoregressive Language Model For Context-conditioned Generation Bin Bi et al.
- When Being Unseen From Mbert Is Just The Beginning: Handling New Languages With Multilingual Language Models Benjamin Muller, Antonis Anastasopoulos, Benoît Sagot, Djamé Seddah
- Unqovering Stereotyping Biases Via Underspecified Questions Tao Li, Tushar Khot, Daniel Khashabi, Ashish Sabharwal, Vivek Srikumar
- SOLOIST: Building Task Bots At Scale With Transfer Learning And Machine Teaching Baolin Peng et al.
- Exploring And Predicting Transferability Across NLP Tasks Tu Vu et al.
- When Do You Need Billions Of Words Of Pretraining Data? Yian Zhang, Alex Warstadt, Haau-sing Li, Samuel R. Bowman
- Chatbot Interaction With Artificial Intelligence: Human Data Augmentation With T5 And Language Transformer Ensemble For Text Classification Jordan J. Bird, Anikó Ekárt, Diego R. Faria
- Improving Vision-and-language Navigation With Image-text Pairs From The Web Arjun Majumdar et al.
- SPECTER: Document-level Representation Learning Using Citation-informed Transformers Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld
- DIET: Lightweight Language Understanding For Dialogue Systems Tanja Bunk, Daksh Varshneya, Vladimir Vlasov, Alan Nichol
- From Zero To Hero: On The Limitations Of Zero-shot Cross-lingual Transfer With Multilingual Transformers Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš
- A Comparison Of LSTM And BERT For Small Corpus Aysu Ezen-can
- Better Fine-tuning By Reducing Representational Collapse Armen Aghajanyan et al.
- ECONET: Effective Continual Pretraining Of Language Models For Event Temporal Reasoning Rujun Han, Xiang Ren, Nanyun Peng
- Incorporating BERT Into Parallel Sequence Decoding With Adapters Junliang Guo et al.
- Language Models As Few-shot Learner For Task-oriented Dialogue Systems Andrea Madotto, Zihan Liu, Zhaojiang Lin, Pascale Fung
- Syntactic Data Augmentation Increases Robustness To Inference Heuristics Junghyun Min, R. Thomas Mccoy, Dipanjan Das, Emily Pitler, Tal Linzen
- End-to-end Synthetic Data Generation For Domain Adaptation Of Question Answering Systems Siamak Shakeri et al.
- What Happens To BERT Embeddings During Fine-tuning? Amil Merchant, Elahe Rahimtoroghi, Ellie Pavlick, Ian Tenney
- Recall And Learn: Fine-tuning Deep Pretrained Language Models With Less Forgetting Sanyuan Chen et al.
- GOBO: Quantizing Attention-based NLP Models For Low Latency And Energy Efficient Inference Ali Hadi Zadeh, Isak Edo, Omar Mohamed Awad, Andreas Moshovos
- Fine-tuning BERT For Schema-guided Zero-shot Dialogue State Tracking Yu-ping Ruan, Zhen-hua Ling, Jia-chen Gu, Quan Liu
- An Empirical Investigation Of Pre-trained Transformer Language Models For Open-domain Dialogue Generation Piji Li
- Adapterhub: A Framework For Adapting Transformers Jonas Pfeiffer et al.
- Automated Source Code Generation And Auto-completion Using Deep Learning: Comparing And Discussing Current Language-model-related Approaches Juan Cruz-benito, Sanjay Vishwakarma, Francisco Martin-fernandez, Ismael Faro
- Retrieval-augmented Generation For Knowledge-intensive NLP Tasks Patrick Lewis et al.
- How Much Knowledge Can You Pack Into The Parameters Of A Language Model? Adam Roberts, Colin Raffel, Noam Shazeer
- Dialoglue: A Natural Language Understanding Benchmark For Task-oriented Dialogue Shikib Mehri, Mihail Eric, Dilek Hakkani-tur
- UBAR: Towards Fully End-to-end Task-oriented Dialog Systems With GPT-2 Yunyi Yang, Yunhao Li, Xiaojun Quan
- Language Models Are Few-shot Learners Tom B. Brown et al.
- Text-to-text Pre-training For Data-to-text Tasks Mihir Kale, Abhinav Rastogi
- How Can We Know When Language Models Know? On The Calibration Of Language Models For Question Answering Zhengbao Jiang, Jun Araki, Haibo Ding, Graham Neubig
- Mintl: Minimalist Transfer Learning For Task-oriented Dialogue Systems Zhaojiang Lin, Andrea Madotto, Genta Indra Winata, Pascale Fung
- Earlybert: Efficient BERT Training Via Early-bird Lottery Tickets Xiaohan Chen et al.
- Cosda-ml: Multi-lingual Code-switching Data Augmentation For Zero-shot Cross-lingual NLP Libo Qin, Minheng Ni, Yue Zhang, Wanxiang Che
- Multi-modal Open-domain Dialogue Kurt Shuster, Eric Michael Smith, Da Ju, Jason Weston
- The Language Interpretability Tool: Extensible, Interactive Visualizations And Analysis For NLP Models Ian Tenney et al.
- Pre-training Via Paraphrasing Mike Lewis et al.
- How Fine Can Fine-tuning Be? Learning Efficient Language Models Evani Radiya-dixit, Xin Wang
- What Does BERT Know About Books, Movies And Music? Probing BERT For Conversational Recommendation Gustavo Penha, Claudia Hauff
- KGPT: Knowledge-grounded Pre-training For Data-to-text Generation Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang
- Mobilebert: A Compact Task-agnostic BERT For Resource-limited Devices Zhiqing Sun et al.
- Dialogbert: Discourse-aware Response Generation Via Learning To Recover And Rank Utterances Xiaodong Gu, Kang Min Yoo, Jung-woo Ha
- DAVE: Deriving Automatically Verilog From English Hammond Pearce, Benjamin Tan, Ramesh Karri
- Indic-transformers: An Analysis Of Transformer Language Models For Indian Languages Kushal Jain, Adwait Deshpande, Kumar Shridhar, Felix Laumann, Ayushman Dash
- Multilingual Speech Translation With Efficient Finetuning Of Pretrained Models Xian Li et al.
- BERT Based Multilingual Machine Comprehension In English And Hindi Somil Gupta, Nilesh Khade
- Minilm: Deep Self-attention Distillation For Task-agnostic Compression Of Pre-trained Transformers Wenhui Wang et al.
- Adversarial Training For Large Neural Language Models Xiaodong Liu et al.
- Unsupervised Evaluation Of Interactive Dialog With Dialogpt Shikib Mehri, Maxine Eskenazi
- The Pile: An 800GB Dataset Of Diverse Text For Language Modeling Leo Gao et al.
- Rethinking Embedding Coupling In Pre-trained Language Models Hyung Won Chung, Thibault Févry, Henry Tsai, Melvin Johnson, Sebastian Ruder
- Retrofitting Structure-aware Transformer Language Model For End Tasks Hao Fei, Yafeng Ren, Donghong Ji
- Vokenization: Improving Language Understanding With Contextualized, Visual-grounded Supervision Hao Tan, Mohit Bansal
- An Exploratory Study On Long Dialogue Summarization: What Works And What's Next Yusen Zhang et al.
- On Transferability Of Prompt Tuning For Natural Language Processing Yusheng Su et al.
- Truthfulqa: Measuring How Models Mimic Human Falsehoods Stephanie Lin, Jacob Hilton, Owain Evans
- The NLP Cookbook: Modern Recipes For Transformer Based Deep Learning Architectures Sushant Singh, Ausif Mahmood
- Prefix-tuning: Optimizing Continuous Prompts For Generation Xiang Lisa Li, Percy Liang
- P-tuning V2: Prompt Tuning Can Be Comparable To Fine-tuning Universally Across Scales And Tasks Xiao Liu et al.
- Learning How To Ask: Querying Lms With Mixtures Of Soft Prompts Guanghui Qin, Jason Eisner
- Improved Text Classification Via Contrastive Adversarial Training Lin Pan, Chung-wei Hang, Avirup Sil, Saloni Potdar
- Fine-tuning Large Neural Language Models For Biomedical Natural Language Processing Robert Tinn et al.
- Thank You BART! Rewarding Pre-trained Models Improves Formality Style Transfer Huiyuan Lai, Antonio Toral, Malvina Nissim
- Unsupervised Corpus Aware Language Model Pre-training For Dense Passage Retrieval Luyu Gao, Jamie Callan
- Fewshotqa: A Simple Framework For Few-shot Learning Of Question Answering Tasks Using Pre-trained Text-to-text Models Rakesh Chada, Pradeep Natarajan
- Process For Adapting Language Models To Society (PALMS) With Values-targeted Datasets Irene Solaiman, Christy Dennison
- Bitod: A Bilingual Multi-domain Dataset For Task-oriented Dialogue Modeling Zhaojiang Lin et al.
- Revisiting The Primacy Of English In Zero-shot Cross-lingual Transfer Iulia Turc, Kenton Lee, Jacob Eisenstein, Ming-wei Chang, Kristina Toutanova
- Fast Model Editing At Scale Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning
- Using Prior Knowledge To Guide Bert's Attention In Semantic Textual Matching Tasks Tingyu Xia, Yue Wang, Yuan Tian, Yi Chang
- Unipelt: A Unified Framework For Parameter-efficient Language Model Tuning Yuning Mao et al.
- A Recipe For Arbitrary Text Style Transfer With Large Language Models Emily Reif et al.
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- How Should Pre-trained Language Models Be Fine-tuned Towards Adversarial Robustness? Xinshuai Dong, Luu Anh Tuan, Min Lin, Shuicheng Yan, Hanwang Zhang
- Bitfit: Simple Parameter-efficient Fine-tuning For Transformer-based Masked Language-models Elad Ben Zaken, Shauli Ravfogel, Yoav Goldberg
- Arat5: Text-to-text Transformers For Arabic Language Generation El Moatez Billah Nagoudi, Abdelrahim Elmadany, Muhammad Abdul-mageed
- Efficient Large Scale Language Modeling With Mixtures Of Experts Mikel Artetxe et al.
- Lora: Low-rank Adaptation Of Large Language Models Edward J. Hu et al.
- Improving And Simplifying Pattern Exploiting Training Derek Tam, Rakesh R Menon, Mohit Bansal, Shashank Srivastava, Colin Raffel
- Compressing Visual-linguistic Model Via Knowledge Distillation Zhiyuan Fang et al.
- Greedy-layer Pruning: Speeding Up Transformer Models For Natural Language Processing David Peer, Sebastian Stabinger, Stefan Engl, Antonio Rodriguez-sanchez
- Cross-attention Is All You Need: Adapting Pretrained Transformers For Machine Translation Mozhdeh Gheini, Xiang Ren, Jonathan May
- PPT: Pre-trained Prompt Tuning For Few-shot Learning Yuxian Gu, Xu Han, Zhiyuan Liu, Minlie Huang
- Ext5: Towards Extreme Multi-task Scaling For Transfer Learning Vamsi Aribandi et al.
- A Plug-and-play Method For Controlled Text Generation Damian Pascual, Beni Egressy, Clara Meister, Ryan Cotterell, Roger Wattenhofer
- Knowledge Neurons In Pretrained Transformers Damai Dai et al.
- Differentially Private Fine-tuning Of Language Models Da Yu et al.
- Long-span Summarization Via Local Attention And Content Selection Potsawee Manakul, Mark J. F. Gales
- What To Pre-train On? Efficient Intermediate Task Selection Clifton Poth, Jonas Pfeiffer, Andreas Rücklé, Iryna Gurevych
- See, Hear, Read: Leveraging Multimodality With Guided Attention For Abstractive Text Summarization Yash Kumar Atri, Shraman Pramanick, Vikram Goyal, Tanmoy Chakraborty
- Recent Advances In Natural Language Processing Via Large Pre-trained Language Models: A Survey Bonan Min et al.
- UNICORN On RAINBOW: A Universal Commonsense Reasoning Model On A New Multitask Benchmark Nicholas Lourie, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi
- Scheduled Sampling In Vision-language Pretraining With Decoupled Encoder-decoder Network Yehao Li, Yingwei Pan, Ting Yao, Jingwen Chen, Tao Mei
- Prompt-learning For Fine-grained Entity Typing Ning Ding et al.
- Scale Efficiently: Insights From Pre-training And Fine-tuning Transformers Yi Tay et al.
- Towards Few-shot Fact-checking Via Perplexity Nayeon Lee, Yejin Bang, Andrea Madotto, Madian Khabsa, Pascale Fung
- HTLM: Hyper-text Pre-training And Prompting Of Language Models Armen Aghajanyan et al.
- Muppet: Massive Multi-task Representations With Pre-finetuning Armen Aghajanyan et al.
- Vl-adapter: Parameter-efficient Transfer Learning For Vision-and-language Tasks Yi-lin Sung, Jaemin Cho, Mohit Bansal
- Prune Once For All: Sparse Pre-trained Language Models Ofir Zafrir, Ariel Larey, Guy Boudoukh, Haihao Shen, Moshe Wasserblat
- ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Yu Sun et al.
- Few-shot Bot: Prompt-based Learning For Dialogue Systems Andrea Madotto, Zhaojiang Lin, Genta Indra Winata, Pascale Fung
- Revisiting Self-training For Few-shot Learning Of Language Model Yiming Chen et al.
- Few-shot Question Answering By Pretraining Span Selection Ori Ram, Yuval Kirstain, Jonathan Berant, Amir Globerson, Omer Levy
- Task-oriented Dialogue System As Natural Language Generation Weizhi Wang et al.
- CPM-2: Large-scale Cost-effective Pre-trained Language Models Zhengyan Zhang et al.
- On Explaining Your Explanations Of BERT: An Empirical Study With Sequence Classification Zhengxuan Wu, Desmond C. Ong
- Large Pre-trained Language Models Contain Human-like Biases Of What Is Right And Wrong To Do Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin A. Rothkopf, Kristian Kersting
- Clip-adapter: Better Vision-language Models With Feature Adapters Peng Gao et al.
- Indicbart: A Pre-trained Model For Indic Natural Language Generation Raj Dabre et al.
- Spot: Better Frozen Model Adaptation Through Soft Prompt Transfer Tu Vu, Brian Lester, Noah Constant, Rami Al-rfou, Daniel Cer
- How Many Data Points Is A Prompt Worth? Teven Le Scao, Alexander M. Rush
- Scalable And Efficient Moe Training For Multitask Multilingual Models Young Jin Kim et al.
- A Simple Recipe For Multilingual Grammatical Error Correction Sascha Rothe, Jonathan Mallinson, Eric Malmi, Sebastian Krause, Aliaksei Severyn
- Scifive: A Text-to-text Transformer Model For Biomedical Literature Long N. Phan et al.
- XTREME-R: Towards More Challenging And Nuanced Multilingual Evaluation Sebastian Ruder et al.
- Sentence-t5: Scalable Sentence Encoders From Pre-trained Text-to-text Models Jianmo Ni et al.
- Tip-adapter: Training-free Clip-adapter For Better Vision-language Modeling Renrui Zhang et al.
- Hiddencut: Simple Data Augmentation For Natural Language Understanding With Better Generalization Jiaao Chen, Dinghan Shen, Weizhu Chen, Diyi Yang
- Compacter: Efficient Low-rank Hypercomplex Adapter Layers Rabeeh Karimi Mahabadi, James Henderson, Sebastian Ruder
- Variational Information Bottleneck For Effective Low-resource Fine-tuning Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- Program Synthesis With Large Language Models Jacob Austin et al.
- Webgpt: Browser-assisted Question-answering With Human Feedback Reiichiro Nakano et al.
- Gpt3mix: Leveraging Large-scale Language Models For Text Augmentation Kang Min Yoo, Dongju Park, Jaewook Kang, Sang-woo Lee, Woomyeong Park
- WARP: Word-level Adversarial Reprogramming Karen Hambardzumyan, Hrant Khachatrian, Jonathan May
- Adapting Language Models For Zero-shot Learning By Meta-tuning On Dataset And Prompt Collections Ruiqi Zhong, Kristy Lee, Zheng Zhang, Dan Klein
- Pretrained Language Models For Text Generation: A Survey Junyi Li, Tianyi Tang, Wayne Xin Zhao, Ji-rong Wen
- Improving Language Models By Retrieving From Trillions Of Tokens Sebastian Borgeaud et al.
- Robertuito: A Pre-trained Language Model For Social Media Text In Spanish Juan Manuel Pérez, Damián A. Furman, Laura Alonso Alemany, Franco Luque
- AMMUS: A Survey Of Transformer-based Pretrained Models In Natural Language Processing Katikapalli Subramanyam Kalyan, Ajit Rajasekharan, Sivanesan Sangeetha
- On The Effectiveness Of Adapter-based Tuning For Pretrained Language Model Adaptation Ruidan He et al.
- Raise A Child In Large Language Model: Towards Effective And Generalizable Fine-tuning Runxin Xu et al.
- FILM: Following Instructions In Language With Modular Methods So Yeon Min, Devendra Singh Chaplot, Pradeep Ravikumar, Yonatan Bisk, Ruslan Salakhutdinov
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- Webshop: Towards Scalable Real-world Web Interaction With Grounded Language Agents Shunyu Yao, Howard Chen, John Yang, Karthik Narasimhan
- Coderl: Mastering Code Generation Through Pretrained Models And Deep Reinforcement Learning Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Silvio Savarese, Steven C. H. Hoi
- Recommendation As Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5) Shijie Geng, Shuchang Liu, Zuohui Fu, Yingqiang Ge, Yongfeng Zhang
- Perturbation Augmentation For Fairer NLP Rebecca Qian et al.
- Vision-and-language Pretrained Models: A Survey Siqu Long, Feiqi Cao, Soyeon Caren Han, Haiqin Yang
- Few-shot Parameter-efficient Fine-tuning Is Better And Cheaper Than In-context Learning Haokun Liu et al.
- Rethinking With Retrieval: Faithful Large Language Model Inference Hangfeng He, Hongming Zhang, Dan Roth
- Prototypical Verbalizer For Prompt-based Few-shot Tuning Ganqu Cui, Shengding Hu, Ning Ding, Longtao Huang, Zhiyuan Liu
- Data Augmentation For Intent Classification With Off-the-shelf Large Language Models Gaurav Sahu et al.
- Synchromesh: Reliable Code Generation From Pre-trained Language Models Gabriel Poesia et al.
- On The Transferability Of Pre-trained Language Models For Low-resource Programming Languages Fuxiang Chen, Fatemeh Fard, David Lo, Timofey Bryksin
- Chatgpt Makes Medicine Easy To Swallow: An Exploratory Case Study On Simplified Radiology Reports Katharina Jeblick et al.
- Healthprompt: A Zero-shot Learning Paradigm For Clinical Natural Language Processing Sonish Sivarajkumar, Yanshan Wang
- Speechprompt: An Exploration Of Prompt Tuning On Generative Spoken Language Model For Speech Processing Tasks Kai-wei Chang, Wei-cheng Tseng, Shang-wen Li, Hung-yi Lee
- Deepspeed-moe: Advancing Mixture-of-experts Inference And Training To Power Next-generation AI Scale Samyam Rajbhandari et al.
- Training Compute-optimal Large Language Models Jordan Hoffmann et al.
- OPT-IML: Scaling Language Model Instruction Meta Learning Through The Lens Of Generalization Srinivasan Iyer et al.
- Towards Trustworthy Autograding Of Short, Multi-lingual, Multi-type Answers Johannes Schneider, Robin Richner, Micha Riser
- News Summarization And Evaluation In The Era Of GPT-3 Tanya Goyal, Junyi Jessy Li, Greg Durrett
- Controllable Natural Language Generation With Contrastive Prefixes Jing Qian, Li Dong, Yelong Shen, Furu Wei, Weizhu Chen
- Unified-io: A Unified Model For Vision, Language, And Multi-modal Tasks Jiasen Lu, Christopher Clark, Rowan Zellers, Roozbeh Mottaghi, Aniruddha Kembhavi
- GIT: A Generative Image-to-text Transformer For Vision And Language Jianfeng Wang et al.
- Improving The Domain Adaptation Of Retrieval Augmented Generation (RAG) Models For Open Domain Question Answering Shamane Siriwardhana et al.
- Large Language Models Can Self-improve Jiaxin Huang et al.
- Adapting Pre-trained Language Models To African Languages Via Multilingual Adaptive Fine-tuning Jesujoba O. Alabi, David Ifeoluwa Adelani, Marius Mosbach, Dietrich Klakow
- Benchmarking Large Language Models For Automated Verilog RTL Code Generation Shailja Thakur et al.
- Convfinqa: Exploring The Chain Of Numerical Reasoning In Conversational Finance Question Answering Zhiyu Chen et al.
- Using Deepspeed And Megatron To Train Megatron-turing NLG 530B, A Large-scale Generative Language Model Shaden Smith et al.
- Inpars: Data Augmentation For Information Retrieval Using Large Language Models Luiz Bonifacio, Hugo Abonizio, Marzieh Fadaee, Rodrigo Nogueira
- Training Language Models To Follow Instructions With Human Feedback Long Ouyang et al.
- Instructionner: A Multi-task Instruction-based Generative Framework For Few-shot NER Liwen Wang et al.
- Structured Like A Language Model: Analysing AI As An Automated Subject Liam Magee, Vanicka Arora, Luke Munn
- Real Or Fake Text?: Investigating Human Ability To Detect Boundaries Between Human-written And Machine-generated Text Liam Dugan, Daphne Ippolito, Arun Kirubarajan, Sherry Shi, Chris Callison-burch
- Lamda: Language Models For Dialog Applications Romal Thoppilan et al.
- Efficient Few-shot Learning Without Prompts Lewis Tunstall et al.
- The Goldilocks Of Pragmatic Understanding: Fine-tuning Strategy Matters For Implicature Resolution By Llms Laura Ruis et al.
- Exploring The Universal Vulnerability Of Prompt-based Learning Paradigm Lei Xu, Yangyi Chen, Ganqu Cui, Hongcheng Gao, Zhiyuan Liu
- Who Is GPT-3? An Exploration Of Personality, Values And Demographics Marilù Miotto, Nicola Rossberg, Bennett Kleinberg
- Reproducible Scaling Laws For Contrastive Language-image Learning Mehdi Cherti et al.
- Visual Prompt Tuning Menglin Jia et al.
- Legal Prompting: Teaching A Language Model To Think Like A Lawyer Fangyi Yu, Lee Quartey, Frank Schilder
- Star: Bootstrapping Reasoning With Reasoning Eric Zelikman, Yuhuai Wu, Jesse Mu, Noah D. Goodman
- Compilable Neural Code Generation With Compiler Feedback Xin Wang et al.
- IGLUE: A Benchmark For Transfer Learning Across Modalities, Tasks, And Languages Emanuele Bugliarello et al.
- The Optimal BERT Surgeon: Scalable And Accurate Second-order Pruning For Large Language Models Eldar Kurtic et al.
- Bytetransformer: A High-performance Transformer Boosted For Variable-length Inputs Yujia Zhai et al.
- Quark: Controllable Text Generation With Reinforced Unlearning Ximing Lu et al.
- Legal Prompt Engineering For Multilingual Legal Judgement Prediction Dietrich Trautmann, Alina Petrova, Frank Schilder
- Lm-nav: Robotic Navigation With Large Pre-trained Models Of Language, Vision, And Action Dhruv Shah, Blazej Osinski, Brian Ichter, Sergey Levine
- Visual-language Navigation Pretraining Via Prompt-based Environmental Self-exploration Xiwen Liang, Fengda Zhu, Lingling Li, Hang Xu, Xiaodan Liang
- Learning Vector-quantized Item Representation For Transferable Sequential Recommenders Yupeng Hou, Zhankui He, Julian Mcauley, Wayne Xin Zhao
- Factpegasus: Factuality-aware Pre-training And Fine-tuning For Abstractive Summarization David Wan, Mohit Bansal
- Language And Culture Internalisation For Human-like Autotelic AI Cédric Colas, Tristan Karch, Clément Moulin-frier, Pierre-yves Oudeyer
- Unified Vision And Language Prompt Learning Yuhang Zang, Wei Li, Kaiyang Zhou, Chen Huang, Chen Change Loy
- LAION-5B: An Open Large-scale Dataset For Training Next Generation Image-text Models Christoph Schuhmann et al.
- Noisytune: A Little Noise Can Help You Finetune Pretrained Language Models Better Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang, Xing Xie
- No More Fine-tuning? An Experimental Evaluation Of Prompt Tuning In Code Intelligence Chaozheng Wang et al.
- Adamix: Mixture-of-adaptations For Parameter-efficient Model Tuning Yaqing Wang et al.
- IDPG: An Instance-dependent Prompt Generation Method Zhuofeng Wu et al.
- Optimizing Prompts For Text-to-image Generation Yaru Hao, Zewen Chi, Li Dong, Furu Wei
- Thinking About GPT-3 In-context Learning For Biomedical IE? Think Again Bernal Jiménez Gutiérrez et al.
- Prompt-aligned Gradient For Prompt Tuning Beier Zhu, Yulei Niu, Yucheng Han, Yue Wu, Hanwang Zhang
- St-moe: Designing Stable And Transferable Sparse Expert Models Barret Zoph et al.
- GODEL: Large-scale Pre-training For Goal-directed Dialog Baolin Peng et al.
- Generative Language Models For Paragraph-level Question Generation Asahi Ushio, Fernando Alva-manchego, Jose Camacho-collados
- Selection-inference: Exploiting Large Language Models For Interpretable Logical Reasoning Antonia Creswell, Murray Shanahan, Irina Higgins
- UL2: Unifying Language Learning Paradigms Yi Tay et al.
- Mslam: Massively Multilingual Joint Pre-training For Speech And Text Ankur Bapna et al.
- Internet-augmented Language Models Through Few-shot Prompting For Open-domain Question Answering Angeliki Lazaridou, Elena Gribovskaya, Wojciech Stokowiec, Nikolai Grigorev
- Generating Training Data With Language Models: Towards Zero-shot Language Understanding Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han
- Active Example Selection For In-context Learning Yiming Zhang, Shi Feng, Chenhao Tan
- Standing On The Shoulders Of Giant Frozen Language Models Yoav Levine et al.
- Should You Mask 15% In Masked Language Modeling? Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen
- Personalized Prompt For Sequential Recommendation Yiqing Wu et al.
- ATTEMPT: Parameter-efficient Multi-task Tuning Via Attentional Mixtures Of Soft Prompts Akari Asai, Mohammadreza Salehi, Matthew E. Peters, Hannaneh Hajishirzi
- Prompt Tuning For Discriminative Pre-trained Language Models Yuan Yao et al.
- Language Models Are Greedy Reasoners: A Systematic Formal Analysis Of Chain-of-thought Abulhair Saparov, He He
- No Language Left Behind: Scaling Human-centered Machine Translation NLLB Team et al.
- Delta Tuning: A Comprehensive Study Of Parameter Efficient Methods For Pre-trained Language Models Ning Ding et al.
- LIFT: Language-interfaced Fine-tuning For Non-language Machine Learning Tasks Tuan Dinh et al.
- SGPT: GPT Sentence Embeddings For Semantic Search Niklas Muennighoff
- Crosslingual Generalization Through Multitask Finetuning Niklas Muennighoff et al.
- Large Language Models Are Reasoning Teachers Namgyu Ho, Laura Schmid, Se-young Yun
- Clinical Prompt Learning With Frozen Language Models Niall Taylor, Yi Zhang, Dan Joyce, Alejo Nevado-holgado, Andrey Kormilitzin
- Dylora: Parameter Efficient Tuning Of Pre-trained Models Using Dynamic Search-free Low-rank Adaptation Mojtaba Valipour, Mehdi Rezagholizadeh, Ivan Kobyzev, Ali Ghodsi
- KALA: Knowledge-augmented Language Model Adaptation Minki Kang, Jinheon Baek, Sung Ju Hwang
- GPT Takes The Bar Exam Michael Bommarito II, Daniel Martin Katz
- Meta Policy Learning For Cold-start Conversational Recommendation Zhendong Chu, Hongning Wang, Yun Xiao, Bo Long, Lingfei Wu
- M6-rec: Generative Pretrained Language Models Are Open-ended Recommender Systems Zeyu Cui, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang
- Zero- And Few-shot Prompting With Llms: A Comparative Study With Fine-tuned Models For Bangla Sentiment Analysis Md. Arid Hasan et al.
- Can Llms Express Their Uncertainty? An Empirical Evaluation Of Confidence Elicitation In Llms Miao Xiong et al.
- Drivegpt4: Interpretable End-to-end Autonomous Driving Via Large Language Model Zhenhua Xu et al.
- Distilling Large Language Models For Matching Patients To Clinical Trials Mauro Nievas, Aditya Basu, Yanshan Wang, Hrituraj Singh
- Few-shot Fine-tuning Vs. In-context Learning: A Fair Comparison And Evaluation Marius Mosbach, Tiago Pimentel, Shauli Ravfogel, Dietrich Klakow, Yanai Elazar
- LLM Self Defense: By Self Examination, Llms Know They Are Being Tricked Mansi Phute et al.
- Chatgpt For Vulnerability Detection, Classification, And Repair: How Far Are We? Michael Fu, Chakkrit Tantithamthavorn, Van Nguyen, Trung Le
- Human-ai Collaboration In Thematic Analysis Using Chatgpt: A User Study And Design Recommendations Lixiang Yan et al.
- Driving With Llms: Fusing Object-level Vector Modality For Explainable Autonomous Driving Long Chen et al.
- Taiyi: A Bilingual Fine-tuned Large Language Model For Diverse Biomedical Tasks Ling Luo et al.
- Parameter-efficient Fine-tuning Methods For Pretrained Language Models: A Critical Review And Assessment Lingling Xu, Haoran Xie, Si-zhao Joe Qin, Xiaohui Tao, Fu Lee Wang
- Scaling Autoregressive Multi-modal Models: Pretraining And Instruction Tuning Lili Yu et al.
- A Survey On Large Language Models For Recommendation Likang Wu et al.
- Improving Text Embeddings With Large Language Models Liang Wang et al.
- Zephyr: Direct Distillation Of LM Alignment Lewis Tunstall et al.
- In-context Impersonation Reveals Large Language Models' Strengths And Biases Leonard Salewski, Stephan Alaniz, Isabel Rio-torto, Eric Schulz, Zeynep Akata
- Query2doc: Query Expansion With Large Language Models Liang Wang, Nan Yang, Furu Wei
- ULIP-2: Towards Scalable Multimodal Pre-training For 3D Understanding Le Xue et al.
- Dissociating Language And Thought In Large Language Models Kyle Mahowald et al.
- Mvbench: A Comprehensive Multi-modal Video Understanding Benchmark Kunchang Li et al.
- Sentimentgpt: Exploiting GPT For Advanced Sentiment Analysis And Its Departure From Current Machine Learning Kiana Kheiri, Hamid Karimi
- Just Tell Me: Prompt Engineering In Business Process Management Kiran Busch, Alexander Rochlitzer, Diana Sola, Henrik Leopold
- Automating Customer Service Using Langchain: Building Custom Open-source GPT Chatbot For Organizations Keivalya Pandya, Mehfuza Holia
- A Survey Of GPT-3 Family Large Language Models Including Chatgpt And GPT-4 Katikapalli Subramanyam Kalyan
- Biomedical Knowledge Graph-optimized Prompt Generation For Large Language Models Karthik Soman et al.
- Aligning Instruction Tasks Unlocks Large Language Models As Zero-shot Relation Extractors Kai Zhang, Bernal Jiménez Gutiérrez, Yu Su
- Full Parameter Fine-tuning For Large Language Models With Limited Resources Kai Lv et al.
- Llama-reviewer: Advancing Code Review Automation With Large Language Models Through Parameter-efficient Fine-tuning Junyi Lu, Lei Yu, Xiaojia Li, Li Yang, Chun Zuo
- Exploring The Benefits Of Training Expert Language Models Over Instruction Tuning Joel Jang et al.
- Missrec: Pre-training And Transferring Multi-modal Interest-aware Sequence Representation For Recommendation Jinpeng Wang et al.
- Empowering Molecule Discovery For Molecule-caption Translation With Large Language Models: A Chatgpt Perspective Jiatong Li et al.
- Badgpt: Exploring Security Vulnerabilities Of Chatgpt Via Backdoor Attacks To Instructgpt Jiawen Shi, Yixin Liu, Pan Zhou, Lichao Sun
- Language Models Meet World Models: Embodied Experiences Enhance Language Models Jiannan Xiang et al.
- Graphgpt: Graph Instruction Tuning For Large Language Models Jiabin Tang et al.
- Ureader: Universal Ocr-free Visually-situated Language Understanding With Multimodal Large Language Model Jiabo Ye et al.
- VILA: On Pre-training For Visual Language Models Ji Lin et al.
- Memory-efficient Fine-tuning Of Compressed Large Language Models Via Sub-4-bit Integer Quantization Jeonghoon Kim et al.
- Physically Grounded Vision-language Models For Robotic Manipulation Jensen Gao et al.
- The Robots Are Here: Navigating The Generative AI Revolution In Computing Education James Prather et al.
- Chainforge: A Visual Toolkit For Prompt Engineering And LLM Hypothesis Testing Ian Arawjo, Chelse Swoopes, Priyan Vaithilingam, Martin Wattenberg, Elena Glassman
- Principle-driven Self-alignment Of Language Models From Scratch With Minimal Human Supervision Zhiqing Sun et al.
- A Comprehensive Overview Of Large Language Models Humza Naveed et al.
- Llama 2: Open Foundation And Fine-tuned Chat Models Hugo Touvron et al.
- Ip-adapter: Text Compatible Image Prompt Adapter For Text-to-image Diffusion Models Hu Ye, Jun Zhang, Sibo Liu, Xiao Han, Wei Yang
- Doctorglm: Fine-tuning Your Chinese Doctor Is Not A Herculean Task Honglin Xiong et al.
- Bioinstruct: Instruction Tuning Of Large Language Models For Biomedical Natural Language Processing Hieu Tran, Zhichao Yang, Zonghai Yao, Hong Yu
- Extending Context Window Of Large Language Models Via Positional Interpolation Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian
- Self-chained Image-language Model For Video Localization And Question Answering Shoubin Yu, Jaemin Cho, Prateek Yadav, Mohit Bansal
- VIP5: Towards Multimodal Foundation Models For Recommendation Shijie Geng, Juntao Tan, Shuchang Liu, Zuohui Fu, Yongfeng Zhang
- Reasoning With Language Model Is Planning With World Model Shibo Hao et al.
- Instruction Tuning For Large Language Models: A Survey Shengyu Zhang et al.
- Multitask Prompt Tuning Enables Parameter-efficient Transfer Learning Zhen Wang et al.
- Why Does Chatgpt Fall Short In Providing Truthful Answers? Shen Zheng, Jie Huang, Kevin Chen-chuan Chang
- Evaluation Of Chatgpt Family Of Models For Biomedical Reasoning And Classification Shan Chen et al.
- Sur-adapter: Enhancing Text-to-image Pre-trained Diffusion Models With Large Language Models Shanshan Zhong, Zhongzhan Huang, Wushao Wen, Jinghui Qin, Liang Lin
- The Cot Collection: Improving Zero-shot And Few-shot Learning Of Language Models Via Chain-of-thought Fine-tuning Seungone Kim et al.
- H2O: Heavy-hitter Oracle For Efficient Generative Inference Of Large Language Models Zhenyu Zhang et al.
- Luminate: Structured Generation And Exploration Of Design Space With Large Language Models For Human-ai Co-creation Sangho Suh, Meng Chen, Bryan Min, Toby Jia-jun Li, Haijun Xia
- Fine-tuning Language Models With Just Forward Passes Sadhika Malladi et al.
- Does Synthetic Data Generation Of Llms Help Clinical Text Mining? Ruixiang Tang, Xiaotian Han, Xiaoqian Jiang, Xia Hu
- Gpt4tools: Teaching Large Language Model To Use Tools Via Self-instruction Rui Yang et al.
- Secrets Of RLHF In Large Language Models Part I: PPO Rui Zheng et al.
- Llama-adapter: Efficient Fine-tuning Of Language Models With Zero-init Attention Renrui Zhang et al.
- Pro-cap: Leveraging A Frozen Vision-language Model For Hateful Meme Detection Rui Cao et al.
- Scalable Educational Question Generation With Pre-trained Language Models Sahan Bulathwela, Hamze Muse, Emine Yilmaz
- Prompt-based Distribution Alignment For Unsupervised Domain Adaptation Shuanghao Bai et al.
- Lawyer Llama Technical Report Quzhe Huang et al.
- Direct Preference Optimization: Your Language Model Is Secretly A Reward Model Rafael Rafailov et al.
- Mplug-owl: Modularization Empowers Large Language Models With Multimodality Qinghao Ye et al.
- Adalora: Adaptive Budget Allocation For Parameter-efficient Fine-tuning Qingru Zhang et al.
- Pushing Large Language Models To The 6G Edge: Vision, Challenges, And Opportunities Zheng Lin et al.
- Masked Vision And Language Pre-training With Unimodal And Multimodal Contrastive Losses For Medical Visual Question Answering Pengfei Li, Gang Liu, Jinlong He, Zixu Zhao, Shenjun Zhong
- Graphologue: Exploring Large Language Model Responses With Interactive Diagrams Peiling Jiang, Jude Rayan, Steven P. Dow, Haijun Xia
- Fine-tuning Or Retrieval? Comparing Knowledge Injection In Llms Oded Ovadia, Menachem Brief, Moshik Mishaeli, Oren Elisha
- Reflexion: Language Agents With Verbal Reinforcement Learning Noah Shinn et al.
- Enhancing Chat Language Models By Scaling High-quality Instructional Conversations Ning Ding et al.
- Sparse Low-rank Adaptation Of Pre-trained Language Models Ning Ding et al.
- Self-regulating Prompts: Foundational Model Adaptation Without Forgetting Muhammad Uzair Khattak et al.
- Label Supervised Llama Finetuning Zongxi Li et al.
- Using Large Language Models To Generate Junit Tests: An Empirical Study Mohammed Latif Siddiq et al.
- Abscribe: Rapid Exploration & Organization Of Multiple Writing Variations In Human-ai Co-writing Tasks Using Large Language Models Mohi Reza et al.
- A Large Language Model Approach To Educational Survey Feedback Analysis Michael J. Parker, Caitlin Anderson, Claire Stone, Yearim Oh
- LAMM: Language-assisted Multi-modal Instruction-tuning Dataset, Framework, And Benchmark Zhenfei Yin et al.
- Qlora: Efficient Finetuning Of Quantized Llms Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, Luke Zettlemoyer
- Medalpaca -- An Open-source Collection Of Medical Conversational AI Models And Training Data Tianyu Han et al.
- Large Language Model Alignment: A Survey Tianhao Shen et al.
- Multimodal-gpt: A Vision And Language Model For Dialogue With Humans Tao Gong et al.
- Open-ended Medical Visual Question Answering Through Prefix Tuning Of Language Models Tom Van Sonsbeek, Mohammad Mahdi Derakhshani, Ivona Najdenkoska, Cees G. M. Snoek, Marcel Worring
- What Can Large Language Models Do In Chemistry? A Comprehensive Benchmark On Eight Tasks Taicheng Guo et al.
- Sparks Of Artificial General Intelligence: Early Experiments With GPT-4 Sébastien Bubeck et al.
- Uncovering Chatgpt's Capabilities In Recommender Systems Sunhao Dai et al.
- Chatgpt Is Fun, But It Is Not Funny! Humor Is Still Challenging Large Language Models Sophie Jentzsch, Kristian Kersting
- Zhongjing: Enhancing The Chinese Medical Capabilities Of Large Language Model Through Expert Feedback And Real-world Multi-turn Dialogue Songhua Yang et al.
- Revisiting Relation Extraction In The Era Of Large Language Models Somin Wadhwa, Silvio Amir, Byron C. Wallace
- Llm-empowered Chatbots For Psychiatrist And Patient Simulation: Application And Evaluation Siyuan Chen et al.
- Can A Student Large Language Model Perform As Well As Its Teacher? Sia Gholami, Marwan Omar
- Tree Of Thoughts: Deliberate Problem Solving With Large Language Models Shunyu Yao et al.
- Timechat: A Time-sensitive Multimodal Large Language Model For Long Video Understanding Shuhuai Ren, Linli Yao, Shicheng Li, Xu Sun, Lu Hou
- Do Llms Understand User Preferences? Evaluating Llms On User Rating Prediction Wang-cheng Kang et al.
- Unlocking The Potential Of Chatgpt: A Comprehensive Exploration Of Its Applications, Advantages, Limitations, And Future Directions In Natural Language Processing Walid Hariri
- Scaling Down To Scale Up: A Guide To Parameter-efficient Fine-tuning Vladislav Lialin, Vijeta Deshpande, Xiaowei Yao, Anna Rumshisky
- Evaluating Correctness And Faithfulness Of Instruction-following Models For Question Answering Vaibhav Adlakha, Parishad Behnamghader, Xing Han Lu, Nicholas Meade, Siva Reddy
- Language Model Behavior: A Comprehensive Survey Tyler A. Chang, Benjamin K. Bergen
- Promptcblue: A Chinese Prompt Tuning Benchmark For The Medical Domain Wei Zhu, Xiaoling Wang, Huanran Zheng, Mosha Chen, Buzhou Tang
- EVA-02: A Visual Representation For Neon Genesis Yuxin Fang et al.
- Large Language Models As Zero-shot Conversational Recommenders Zhankui He et al.
- Promptify: Text-to-image Generation Through Interactive Prompt Exploration With Large Language Models Stephen Brade, Bryan Wang, Mauricio Sousa, Sageev Oore, Tovi Grossman
- Chatgpt For PLC/DCS Control Logic Generation Heiko Koziolek, Sten Gruener, Virendra Ashiwal
- Can Generalist Foundation Models Outcompete Special-purpose Tuning? Case Study In Medicine Harsha Nori et al.
- Extractive Summarization Via Chatgpt For Faithful Summary Generation Haopeng Zhang, Xiao Liu, Jiawei Zhang
- Autodroid: Llm-powered Task Automation In Android Hao Wen et al.
- Personalisation Within Bounds: A Risk Taxonomy And Policy Framework For The Alignment Of Large Language Models With Personalised Feedback Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale
- Llama Guard: Llm-based Input-output Safeguard For Human-ai Conversations Hakan Inan et al.
- Wizardmath: Empowering Mathematical Reasoning For Large Language Models Via Reinforced Evol-instruct Haipeng Luo et al.
- Voyager: An Open-ended Embodied Agent With Large Language Models Guanzhi Wang et al.
- Efficient Streaming Language Models With Attention Sinks Guangxuan Xiao, Yuandong Tian, Beidi Chen, Song Han, Mike Lewis
- Explainability For Large Language Models: A Survey Haiyan Zhao et al.
- Lawbench: Benchmarking Legal Knowledge Of Large Language Models Zhiwei Fei et al.
- Navgpt: Explicit Reasoning In Vision-and-language Navigation With Large Language Models Gengze Zhou, Yicong Hong, Qi Wu
- Preference Ranking Optimization For Human Alignment Feifan Song et al.
- Codegen2: Lessons For Training Llms On Programming And Natural Languages Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou
- Towards Efficient Fine-tuning Of Pre-trained Code Models: An Experimental Study And Beyond Ensheng Shi et al.
- Lasuie: Unifying Information Extraction With Latent Adaptive Structure-aware Generative Language Model Hao Fei et al.
- Llm-adapters: An Adapter Family For Parameter-efficient Fine-tuning Of Large Language Models Zhiqiang Hu et al.
- Speechgpt: Empowering Large Language Models With Intrinsic Cross-modal Conversational Abilities Dong Zhang et al.
- MELTR: Meta Loss Transformer For Learning To Fine-tune Video Foundation Models Dohwan Ko et al.
- The Vector Grounding Problem Dimitri Coelho Mollo, Raphaël Millière
- Promptner: Prompting For Named Entity Recognition Dhananjay Ashok, Zachary C. Lipton
- One Adapter For All Programming Languages? Adapter Tuning For Code Search And Summarization Deze Wang et al.
- Large Language Models For Generative Information Extraction: A Survey Derong Xu et al.
- Fine-tuning Chatgpt For Automatic Scoring Ehsan Latif, Xiaoming Zhai
- Text-to-sql Empowered By Large Language Models: A Benchmark Evaluation Dawei Gao et al.
- 3d-vista: Pre-trained Transformer For 3D Vision And Text Alignment Ziyu Zhu et al.
- Multimodal Foundation Models: From Specialists To General-purpose Assistants Chunyuan Li et al.
- A Study On The Implementation Of Generative AI Services Using An Enterprise Data-based LLM Application Architecture Cheonsu Jeong
- Llm-powered Data Augmentation For Enhanced Cross-lingual Performance Chenxi Whitehouse, Monojit Choudhury, Alham Fikri Aji
- An Iterative Optimizing Framework For Radiology Report Summarization With Chatgpt Chong Ma et al.
- Supporting Qualitative Analysis With Large Language Models: Combining Codebook With GPT-3 For Deductive Coding Ziang Xiao, Xingdi Yuan, Q. Vera Liao, Rania Abdelghani, Pierre-yves Oudeyer
- Generative Speech Recognition Error Correction With Large Language Models And Task-activating Prompting Chao-han Huck Yang et al.
- Blackvip: Black-box Visual Prompting For Robust Transfer Learning Changdae Oh et al.
- Wizardlm: Empowering Large Language Models To Follow Complex Instructions Can Xu et al.
- Compositional Chain-of-thought Prompting For Large Multimodal Models Chancharik Mitra, Brandon Huang, Trevor Darrell, Roei Herzig
- Pmc-llama: Towards Building Open-source Language Models For Medicine Chaoyi Wu et al.
- Adapting Large Language Models By Integrating Collaborative Semantics For Recommendation Bowen Zheng et al.
- Prompting Or Fine-tuning? A Comparative Study Of Large Language Models For Taxonomy Construction Boqi Chen, Fandi Yi, Dániel Varró
- Prompting Large Language Model For Machine Translation: A Case Study Biao Zhang, Barry Haddow, Alexandra Birch
- Motiongpt: Human Motion As A Foreign Language Biao Jiang et al.
- Clinical Camel: An Open Expert-level Medical Language Model With Dialogue-based Knowledge Encoding Augustin Toma et al.
- Expel: LLM Agents Are Experiential Learners Andrew Zhao et al.
- Fundamentals Of Generative Large Language Models And Perspectives In Cyber-defense Andrei Kucharavy et al.
- Openassistant Conversations -- Democratizing Large Language Model Alignment Andreas Köpf et al.
- Opening Up Chatgpt: Tracking Openness, Transparency, And Accountability In Instruction-tuned Text Generators Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse
- Model Tuning Or Prompt Tuning? A Study Of Large Language Models For Clinical Concept And Relation Extraction Cheng Peng et al.
- Toolllm: Facilitating Large Language Models To Master 16000+ Real-world Apis Yujia Qin et al.
- Adaptive Machine Translation With Large Language Models Yasmin Moslem, Rejwanul Haque, John D. Kelleher, Andy Way
- On Learning To Summarize With Large Language Models As References Yixin Liu et al.
- Recommender Systems In The Era Of Large Language Models (llms) Zihuai Zhao et al.
- A Survey On Large Language Model (LLM) Security And Privacy: The Good, The Bad, And The Ugly Yifan Yao et al.
- Summary Of Chatgpt-related Research And Perspective Towards The Future Of Large Language Models Yiheng Liu et al.
- INSTRUCTEVAL: Towards Holistic Evaluation Of Instruction-tuned Large Language Models Yew Ken Chia, Pengfei Hong, Lidong Bing, Soujanya Poria
- Gpt4aigchip: Towards Next-generation AI Accelerator Design Automation Via Large Language Models Yonggan Fu et al.
- How Far Can Camels Go? Exploring The State Of Instruction Tuning On Open Resources Yizhong Wang et al.
- Biomedgpt: Open Multimodal Generative Pre-trained Transformer For Biomedicine Yizhen Luo et al.
- Low-rank Adaptation Of Large Language Model Rescoring For Parameter-efficient Speech Recognition Yu Yu et al.
- Alpacafarm: A Simulation Framework For Methods That Learn From Human Feedback Yann Dubois et al.
- Mental-llm: Leveraging Large Language Models For Mental Health Prediction Via Online Text Data Xuhai Xu et al.
- Fine-tuning Llama For Multi-stage Text Retrieval Xueguang Ma, Liang Wang, Nan Yang, Furu Wei, Jimmy Lin
- Improving Language Model Negotiation With Self-play And In-context Learning From AI Feedback Yao Fu, Hao Peng, Tushar Khot, Mirella Lapata
- Llm-pruner: On The Structural Pruning Of Large Language Models Xinyin Ma, Gongfan Fang, Xinchao Wang
- PMC-VQA: Visual Instruction Tuning For Medical Visual Question Answering Xiaoman Zhang et al.
- LISA: Reasoning Segmentation Via Large Language Model Xin Lai et al.
- Xuanyuan 2.0: A Large Chinese Financial Chat Model With Hundreds Of Billions Parameters Xuanyu Zhang, Qing Yang, Dongliang Xu
- HPC-GPT: Integrating Large Language Model For High-performance Computing Xianzhong Ding et al.
- Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! Xiangyu Qi et al.
- Beyond Chatbots: Explorellm For Structured Thoughts And Personalized Model Responses Xiao Ma et al.
- Chacha: Leveraging Large Language Models To Prompt Children To Share Their Emotions About Personal Events Woosuk Seo, Chanmo Yang, Young-ho Kim
- Large Language Models In Education: Vision And Opportunities Wensheng Gan, Zhenlian Qi, Jiayang Wu, Jerry Chun-wei Lin
- SPHINX: The Joint Mixing Of Weights, Tasks, And Visual Embeddings For Multi-modal Large Language Models Ziyi Lin et al.
- Guiding Pretraining In Reinforcement Learning With Large Language Models Yuqing Du et al.
- Longbench: A Bilingual, Multitask Benchmark For Long Context Understanding Yushi Bai et al.
- Editing Large Language Models: Problems, Methods, And Opportunities Yunzhi Yao et al.
- Chatdoctor: A Medical Chat Model Fine-tuned On A Large Language Model Meta-ai (llama) Using Medical Domain Knowledge Yunxiang Li et al.
- Describe, Explain, Plan And Select: Interactive Planning With Large Language Models Enables Open-world Multi-task Agents Zihao Wang et al.
- An Empirical Study Of Catastrophic Forgetting In Large Language Models During Continual Fine-tuning Yun Luo et al.
- Exploring The Impact Of Instruction Data Scaling On Large Language Models: An Empirical Study On Real-world Use Cases Yunjie Ji et al.
- Educhat: A Large-scale Language Model-based Chatbot System For Intelligent Education Yuhao Dan et al.
- Aligning Large Language Models With Human: A Survey Yufei Wang et al.
- Guiding Large Language Models Via Directional Stimulus Prompting Zekun Li et al.
- Large Language Models In Healthcare And Medical Domain: A Review Zabir Al Nazi, Wei Peng
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- Chatglm: A Family Of Large Language Models From GLM-130B To GLM-4 All Tools Team Glm et al.
- Adaptmllm: Fine-tuning Multilingual Language Models On Low-resource Languages With Integrated LLM Playgrounds Séamus Lankford, Haithem Afli, Andy Way
- Large Language Models Meet Collaborative Filtering: An Efficient All-round Llm-based Recommender System Sein Kim et al.
- Me Llama: Foundation Large Language Models For Medical Applications Qianqian Xie et al.
- Jamba: A Hybrid Transformer-mamba Language Model Opher Lieber et al.
- Fine-tuned Language Models Generate Stable Inorganic Materials As Text Nate Gruver et al.
- Supporting Sensemaking Of Large Language Model Outputs At Scale Katy Ilonka Gero, Chelse Swoopes, Ziwei Gu, Jonathan K. Kummerfeld, Elena L. Glassman
- Data Is All You Need: Finetuning Llms For Chip Design Via An Automated Design-data Augmentation Framework Kaiyan Chang et al.
- ORPO: Monolithic Preference Optimization Without Reference Model Jiwoo Hong, Noah Lee, James Thorne
- Openmedlm: Prompt Engineering Can Out-perform Fine-tuning In Medical Question-answering With Open-source Large Language Models Jenish Maharjan et al.
- Fine Tuning Vs. Retrieval Augmented Generation For Less Popular Knowledge Heydar Soudani, Evangelos Kanoulas, Faegheh Hasibi
- Closing The Gap Between Open-source And Commercial Large Language Models For Medical Evidence Summarization Gongbo Zhang et al.
- Materials Science In The Era Of Large Language Models: A Perspective Ge Lei, Ronan Docherty, Samuel J. Cooper
- Embedding Large Language Models Into Extended Reality: Opportunities And Challenges For Inclusion, Engagement, And Privacy Efe Bozkir et al.
- Ai-tutoring In Software Engineering Education Eduard Frankford, Clemens Sauerwein, Patrick Bassner, Stephan Krusche, Ruth Breu
- Chemllm: A Chemical Large Language Model Di Zhang et al.
- Deepseek-v2: A Strong, Economical, And Efficient Mixture-of-experts Language Model Deepseek-ai et al.
- Understanding Large-language Model (llm)-powered Human-robot Interaction Callie Y. Kim, Christine P. Lee, Bilge Mutlu
- MM1: Methods, Analysis & Insights From Multimodal LLM Pre-training Brandon Mckinzie et al.
- Taking The Next Step With Generative Artificial Intelligence: The Transformative Role Of Multimodal Large Language Models In Science Education Arne Bewersdorff et al.
- RAG Vs Fine-tuning: Pipelines, Tradeoffs, And A Case Study On Agriculture Angels Balaguer et al.
- Does Fine-tuning Llms On New Knowledge Encourage Hallucinations? Zorik Gekhman et al.
- Harnessing Large Language Models For Text-rich Sequential Recommendation Zhi Zheng, Wenshuo Chao, Zhaopeng Qiu, Hengshu Zhu, Hui Xiong
- Llmparser: An Exploratory Study On Using Large Language Models For Log Parsing Zeyang Ma, An Ran Chen, Dong Jae Kim, Tse-hsun Chen, Shaowei Wang
- A Survey On Lora Of Large Language Models Yuren Mao et al.
- Understanding Llms: A Comprehensive Overview From Training To Inference Yiheng Liu et al.
- Llamafactory: Unified Efficient Fine-tuning Of 100+ Language Models Yaowei Zheng et al.
- Datasets For Large Language Models: A Comprehensive Survey Yang Liu, Jiahuan Cao, Chongyu Liu, Kai Ding, Lianwen Jin
- Data-efficient Fine-tuning For Llm-based Recommendation Xinyu Lin et al.
- Can Generative Llms Create Query Variants For Test Collections? An Exploratory Study Marwah Alaofi, Luke Gallagher, Mark Sanderson, Falk Scholer, Paul Thomas
- Deepseek-r1: Incentivizing Reasoning Capability In Llms Via Reinforcement Learning Deepseek-ai et al.
🏷 GPT
- Maskgan: Better Text Generation Via Filling In The______ William Fedus, Ian Goodfellow, Andrew M. Dai
- Robust Text-to-sql Generation With Execution-guided Decoding Chenglong Wang et al.
- BART: Denoising Sequence-to-sequence Pre-training For Natural Language Generation, Translation, And Comprehension Mike Lewis et al.
- Masked Language Model Scoring Julian Salazar, Davis Liang, Toan Q. Nguyen, Katrin Kirchhoff
- Q8BERT: Quantized 8bit BERT Ofir Zafrir, Guy Boudoukh, Peter Izsak, Moshe Wasserblat
- Universal Adversarial Triggers For Attacking And Analyzing NLP Eric Wallace, Shi Feng, Nikhil Kandpal, Matt Gardner, Sameer Singh
- Non-autoregressive Transformer By Position Learning Yu Bao et al.
- Encode, Tag, Realize: High-precision Text Editing Eric Malmi, Sebastian Krause, Sascha Rothe, Daniil Mirylenka, Aliaksei Severyn
- Generalization In Generation: A Closer Look At Exposure Bias Florian Schmidt
- Language Modeling With Deep Transformers Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney
- Zero: Memory Optimizations Toward Training Trillion Parameter Models Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, Yuxiong He
- Gpt-based Generation For Classical Chinese Poetry Yi Liao, Yasheng Wang, Qun Liu, Xin Jiang
- A Generalized Framework Of Sequence Generation With Application To Undirected Sequence Models Elman Mansimov, Alex Wang, Sean Welleck, Kyunghyun Cho
- Story Ending Prediction By Transferable BERT Zhongyang Li, Xiao Ding, Ting Liu
- Insertion-based Decoding With Automatically Inferred Generation Order Jiatao Gu, Qi Liu, Kyunghyun Cho
- Levenshtein Transformer Jiatao Gu, Changhan Wang, Jake Zhao
- Leveraging Pre-trained Checkpoints For Sequence Generation Tasks Sascha Rothe, Shashi Narayan, Aliaksei Severyn
- Paraphrasing With Large Language Models Sam Witteveen, Martin Andrews
- Semantics-aware BERT For Language Understanding Zhuosheng Zhang et al.
- Patent Claim Generation By Fine-tuning Openai GPT-2 Jieh-sheng Lee, Jieh Hsiang
- Dialogpt: Large-scale Generative Pre-training For Conversational Response Generation Yizhe Zhang et al.
- A Multiscale Visualization Of Attention In The Transformer Model Jesse Vig
- Evaluating Commonsense In Pre-trained Language Models Xuhui Zhou, Yue Zhang, Leyang Cui, Dandan Huang
- Model Compression With Two-stage Multi-teacher Knowledge Distillation For Web Question Answering System Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang
- Hello, It's GPT-2 -- How Can I Help You? Towards The Use Of Pretrained Language Models For Task-oriented Dialogue Systems Paweł Budzianowski, Ivan Vulić
- Do Massively Pretrained Language Models Make Better Storytellers? Abigail See, Aneesh Pappu, Rohun Saxena, Akhila Yerukola, Christopher D. Manning
- Insertion Transformer: Flexible Sequence Generation Via Insertion Operations Mitchell Stern, William Chan, Jamie Kiros, Jakob Uszkoreit
- Megatron-lm: Training Multi-billion Parameter Language Models Using Model Parallelism Mohammad Shoeybi et al.
- Towards Making The Most Of BERT In Neural Machine Translation Jiacheng Yang et al.
- Learning To Answer By Learning To Ask: Getting The Best Of GPT-2 And BERT Worlds Tassilo Klein, Moin Nabi
- Visualizing Attention In Transformer-based Language Representation Models Jesse Vig
- Unsupervised Paraphrase Generation Using Pre-trained Language Models Chaitra Hegde, Shrikumar Patil
- Progressive Generation Of Long Text With Pretrained Language Models Bowen Tan, Zichao Yang, Maruan Al-shedivat, Eric P. Xing, Zhiting Hu
- Artificial Intelligence Versus Maya Angelou: Experimental Evidence That People Cannot Differentiate Ai-generated From Human-written Poetry Nils Köbis, Luca Mossink
- The Radicalization Risks Of GPT-3 And Advanced Neural Language Models Kris Mcguffie, Alex Newhouse
- The Chess Transformer: Mastering Play Using Generative Language Models David Noever, Matt Ciolino, Josh Kalin
- Pymt5: Multi-mode Translation Of Natural Language And Python Code With Transformers Colin B. Clement, Dawn Drain, Jonathan Timcheck, Alexey Svyatkovskiy, Neel Sundaresan
- Optimus: Organizing Sentences Via Pre-trained Modeling Of A Latent Space Chunyuan Li et al.
- Knowledge-aware Language Model Pretraining Corby Rosset et al.
- Making Pre-trained Language Models Better Few-shot Learners Tianyu Gao, Adam Fisch, Danqi Chen
- A Knowledge-enhanced Pretraining Model For Commonsense Story Generation Jian Guan, Fei Huang, Zhihao Zhao, Xiaoyan Zhu, Minlie Huang
- Gpt-too: A Language-model-first Approach For Amr-to-text Generation Manuel Mager et al.
- Few-shot Generative Conversational Query Rewriting Shi Yu et al.
- EDITOR: An Edit-based Transformer With Repositioning For Neural Machine Translation With Soft Lexical Constraints Weijia Xu, Marine Carpuat
- BANG: Bridging Autoregressive And Non-autoregressive Generation With Large Scale Pretraining Weizhen Qi et al.
- It's Not Just Size That Matters: Small Language Models Are Also Few-shot Learners Timo Schick, Hinrich Schütze
- XGPT: Cross-modal Generative Pre-training For Image Captioning Qiaolin Xia et al.
- A Simple Language Model For Task-oriented Dialogue Ehsan Hosseini-asl, Bryan Mccann, Chien-sheng Wu, Semih Yavuz, Richard Socher
- PALM: Pre-training An Autoencoding&autoregressive Language Model For Context-conditioned Generation Bin Bi et al.
- Gedi: Generative Discriminator Guided Sequence Generation Ben Krause et al.
- Few-shot Natural Language Generation For Task-oriented Dialog Baolin Peng et al.
- Genaug: Data Augmentation For Finetuning Text Generators Steven Y. Feng, Varun Gangal, Dongyeop Kang, Teruko Mitamura, Eduard Hovy
- A Large-scale Chinese Short-text Conversation Dataset Yida Wang et al.
- Incorporating BERT Into Parallel Sequence Decoding With Adapters Junliang Guo et al.
- Language Models As Few-shot Learner For Task-oriented Dialogue Systems Andrea Madotto, Zihan Liu, Zhaojiang Lin, Pascale Fung
- CPM: A Large-scale Generative Chinese Pre-trained Language Model Zhengyan Zhang et al.
- Non-autoregressive Machine Translation With Disentangled Context Transformer Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu
- POINTER: Constrained Progressive Text Generation Via Insertion-based Generative Pre-training Yizhe Zhang et al.
- Realtoxicityprompts: Evaluating Neural Toxic Degeneration In Language Models Samuel Gehman, Suchin Gururangan, Maarten Sap, Yejin Choi, Noah A. Smith
- Simplifying Paragraph-level Question Generation Via Transformer Language Models Luis Enrico Lopez, Diane Kathryn Cruz, Jan Christian Blaise Cruz, Charibeth Cheng
- UBAR: Towards Fully End-to-end Task-oriented Dialog Systems With GPT-2 Yunyi Yang, Yunhao Li, Xiaojun Quan
- Narrative Interpolation For Generating And Understanding Stories Su Wang, Greg Durrett, Katrin Erk
- Language Models Are Few-shot Learners Tom B. Brown et al.
- Text-to-text Pre-training For Data-to-text Tasks Mihir Kale, Abhinav Rastogi
- How Can We Know When Language Models Know? On The Calibration Of Language Models For Question Answering Zhengbao Jiang, Jun Araki, Haibo Ding, Graham Neubig
- Text Generation By Learning From Demonstrations Richard Yuanzhe Pang, He He
- Variational Transformers For Diverse Response Generation Zhaojiang Lin, Genta Indra Winata, Peng Xu, Zihan Liu, Pascale Fung
- Turngpt: A Transformer-based Language Model For Predicting Turn-taking In Spoken Dialog Erik Ekstedt, Gabriel Skantze
- CERT: Contrastive Self-supervised Learning For Language Understanding Hongchao Fang, Sicheng Wang, Meng Zhou, Jiayuan Ding, Pengtao Xie
- KGPT: Knowledge-grounded Pre-training For Data-to-text Generation Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang
- Aragpt2: Pre-trained Transformer For Arabic Language Generation Wissam Antoun, Fady Baly, Hazem Hajj
- Training Question Answering Models From Synthetic Data Raul Puri, Ryan Spring, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro
- Dialogbert: Discourse-aware Response Generation Via Learning To Recover And Rank Utterances Xiaodong Gu, Kang Min Yoo, Jung-woo Ha
- DAVE: Deriving Automatically Verilog From English Hammond Pearce, Benjamin Tan, Ramesh Karri
- Plotmachines: Outline-conditioned Generation With Dynamic Plot State Tracking Hannah Rashkin, Asli Celikyilmaz, Yejin Choi, Jianfeng Gao
- Unilmv2: Pseudo-masked Language Models For Unified Language Model Pre-training Hangbo Bao et al.
- Emptransfo: A Multi-head Transformer Architecture For Creating Empathetic Dialog Systems Rohola Zandie, Mohammad H. Mahoor
- Trojaning Language Models For Fun And Profit Xinyang Zhang, Zheng Zhang, Shouling Ji, Ting Wang
- ELECTRA: Pre-training Text Encoders As Discriminators Rather Than Generators Kevin Clark, Minh-thang Luong, Quoc V. Le, Christopher D. Manning
- Unsupervised Evaluation Of Interactive Dialog With Dialogpt Shikib Mehri, Maxine Eskenazi
- Data Augmentation Using Pre-trained Transformer Models Varun Kumar, Ashutosh Choudhary, Eunah Cho
- The Pile: An 800GB Dataset Of Diverse Text For Language Modeling Leo Gao et al.
- As Good As New. How To Successfully Recycle English GPT-2 To Make Models For Other Languages Wietse De Vries, Malvina Nissim
- Few-shot Learning With Multilingual Language Models Xi Victoria Lin et al.
- Bias Out-of-the-box: An Empirical Analysis Of Intersectional Occupational Biases In Popular Generative Language Models Hannah Kirk et al.
- Truthfulqa: Measuring How Models Mimic Human Falsehoods Stephanie Lin, Jacob Hilton, Owain Evans
- The NLP Cookbook: Modern Recipes For Transformer Based Deep Learning Architectures Sushant Singh, Ausif Mahmood
- Ernie-vilg: Unified Generative Pre-training For Bidirectional Vision-language Generation Han Zhang et al.
- Prefix-tuning: Optimizing Continuous Prompts For Generation Xiang Lisa Li, Percy Liang
- GPT Understands, Too Xiao Liu et al.
- Language Model As An Annotator: Exploring Dialogpt For Dialogue Summarization Xiachong Feng, Xiaocheng Feng, Libo Qin, Bing Qin, Ting Liu
- Thinking Aloud: Dynamic Context Generation Improves Zero-shot Reasoning Performance Of GPT-2 Gregor Betz, Kyle Richardson, Christian Voigt
- Language Models Are Few-shot Multilingual Learners Genta Indra Winata et al.
- Reframing Instructional Prompts To Gptk's Language Swaroop Mishra, Daniel Khashabi, Chitta Baral, Yejin Choi, Hannaneh Hajishirzi
- Text-free Prosody-aware Generative Spoken Language Modeling Eugene Kharitonov et al.
- Transformer-based Conditional Variational Autoencoder For Controllable Story Generation Le Fang et al.
- Prompt Programming For Large Language Models: Beyond The Few-shot Paradigm Laria Reynolds, Kyle Mcdonell
- An Explanation Of In-context Learning As Implicit Bayesian Inference Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma
- Thank You BART! Rewarding Pre-trained Models Improves Formality Style Transfer Huiyuan Lai, Antonio Toral, Malvina Nissim
- ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Shuohuan Wang et al.
- Entailment As Few-shot Learner Sinong Wang, Han Fang, Madian Khabsa, Hanzi Mao, Hao Ma
- Mitigating Political Bias In Language Models Through Reinforced Calibration Ruibo Liu et al.
- Process For Adapting Language Models To Society (PALMS) With Values-targeted Datasets Irene Solaiman, Christy Dennison
- Fast Model Editing At Scale Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning
- GPT-3 Models Are Poor Few-shot Learners In The Biomedical Domain Milad Moradi, Kathrin Blagec, Florian Haberl, Matthias Samwald
- Revealing Persona Biases In Dialogue Systems Emily Sheng, Josh Arnold, Zhou Yu, Kai-wei Chang, Nanyun Peng
- All That's 'human' Is Not Gold: Evaluating Human Evaluation Of Generated Text Elizabeth Clark et al.
- Efficient Large Scale Language Modeling With Mixtures Of Experts Mikel Artetxe et al.
- Lora: Low-rank Adaptation Of Large Language Models Edward J. Hu et al.
- Primer: Searching For Efficient Transformers For Language Modeling David R. So et al.
- The Impact Of Multiple Parallel Phrase Suggestions On Email Input And Composition Behaviour Of Native And Non-native English Writers Daniel Buschek, Martin Zürn, Malin Eiband
- Calibrate Before Use: Improving Few-shot Performance Of Language Models Tony Z. Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh
- A Plug-and-play Method For Controlled Text Generation Damian Pascual, Beni Egressy, Clara Meister, Ryan Cotterell, Roger Wattenhofer
- Differentially Private Fine-tuning Of Language Models Da Yu et al.
- The Stability-efficiency Dilemma: Investigating Sequence Length Warmup For Training GPT Models Conglong Li, Minjia Zhang, Yuxiong He
- MAGMA -- Multimodal Augmentation Of Generative Models Through Adapter-based Finetuning Constantin Eichenberg, Sidney Black, Samuel Weinbach, Letitia Parcalabescu, Anette Frank
- Codet5: Identifier-aware Unified Pre-trained Encoder-decoder Models For Code Understanding And Generation Yue Wang, Weishi Wang, Shafiq Joty, Steven C. H. Hoi
- Glam: Efficient Scaling Of Language Models With Mixture-of-experts Nan Du et al.
- A Token-level Reference-free Hallucination Detection Benchmark For Free-form Text Generation Tianyu Liu et al.
- Terapipe: Token-level Pipeline Parallelism For Training Large-scale Language Models Zhuohan Li et al.
- RAFT: A Real-world Few-shot Text Classification Benchmark Neel Alex et al.
- Fantastically Ordered Prompts And Where To Find Them: Overcoming Few-shot Prompt Order Sensitivity Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, Pontus Stenetorp
- Is GPT-3 Text Indistinguishable From Human Text? Scarecrow: A Framework For Scrutinizing Machine Text Yao Dou, Maxwell Forbes, Rik Koncel-kedziorski, Noah A. Smith, Yejin Choi
- The Power Of Scale For Parameter-efficient Prompt Tuning Brian Lester, Rami Al-rfou, Noah Constant
- Symbolic Knowledge Distillation: From General Language Models To Commonsense Models Peter West et al.
- What Changes Can Large-scale Language Models Bring? Intensive Study On Hyperclova: Billions-scale Korean Generative Pretrained Transformers Boseop Kim et al.
- Openprompt: An Open-source Framework For Prompt-learning Ning Ding et al.
- NSP-BERT: A Prompt-based Few-shot Learner Through An Original Pre-training Task--next Sentence Prediction Yi Sun, Yu Zheng, Chao Hao, Hangping Qiu
- Medically Aware GPT-3 As A Data Generator For Medical Dialogue Summarization Bharath Chintagunta, Namit Katariya, Xavier Amatriain, Anitha Kannan
- Maria: Spanish Language Models Asier Gutiérrez-fandiño et al.
- Towards Facilitating Empathic Conversations In Online Mental Health Support: A Reinforcement Learning Approach Ashish Sharma, Inna W. Lin, Adam S. Miner, David C. Atkins, Tim Althoff
- ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Yu Sun et al.
- GLM: General Language Model Pretraining With Autoregressive Blank Infilling Zhengxiao Du et al.
- Demix Layers: Disentangling Domains For Modular Language Modeling Suchin Gururangan, Mike Lewis, Ari Holtzman, Noah A. Smith, Luke Zettlemoyer
- Few-shot Bot: Prompt-based Learning For Dialogue Systems Andrea Madotto, Zhaojiang Lin, Genta Indra Winata, Pascale Fung
- General-purpose Question-answering With Macaw Oyvind Tafjord, Peter Clark
- An Empirical Study Of GPT-3 For Few-shot Knowledge-based VQA Zhengyuan Yang et al.
- Task-oriented Dialogue System As Natural Language Generation Weizhi Wang et al.
- Understanding The Capabilities, Limitations, And Societal Impact Of Large Language Models Alex Tamkin, Miles Brundage, Jack Clark, Deep Ganguli
- Dexperts: Decoding-time Controlled Text Generation With Experts And Anti-experts Alisa Liu et al.
- Large Pre-trained Language Models Contain Human-like Biases Of What Is Right And Wrong To Do Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin A. Rothkopf, Kristian Kersting
- MT6: Multilingual Pretrained Text-to-text Transformer With Translation Pairs Zewen Chi et al.
- One Question Answering Model For Many Languages With Cross-lingual Dense Passage Retrieval Akari Asai, Xinyan Yu, Jungo Kasai, Hannaneh Hajishirzi
- TURINGBENCH: A Benchmark Environment For Turing Test In The Age Of Neural Text Generation Adaku Uchendu, Zeyu Ma, Thai Le, Rui Zhang, Dongwon Lee
- Exploring Transformers In Natural Language Generation: GPT, BERT, And Xlnet M. Onat Topal, Anil Bas, Imke Van Heerden
- E-vil: A Dataset And Benchmark For Natural Language Explanations In Vision-language Tasks Maxime Kayser et al.
- Codexglue: A Machine Learning Benchmark Dataset For Code Understanding And Generation Shuai Lu et al.
- Pangu-α: Large-scale Autoregressive Pretrained Chinese Language Models With Auto-parallel Computation Wei Zeng et al.
- What Makes Good In-context Examples For GPT-3? Jiachang Liu et al.
- Recursively Summarizing Books With Human Feedback Jeff Wu et al.
- Evaluating Large Language Models Trained On Code Mark Chen et al.
- Finetuned Language Models Are Zero-shot Learners Jason Wei et al.
- How Much Do Language Models Copy From Their Training Data? Evaluating Linguistic Novelty In Text Generation Using RAVEN R. Thomas Mccoy, Paul Smolensky, Tal Linzen, Jianfeng Gao, Asli Celikyilmaz
- Webgpt: Browser-assisted Question-answering With Human Feedback Reiichiro Nakano et al.
- Redditbias: A Real-world Resource For Bias Evaluation And Debiasing Of Conversational Language Models Soumya Barikeri, Anne Lauscher, Ivan Vulić, Goran Glavaš
- Gpt3mix: Leveraging Large-scale Language Models For Text Augmentation Kang Min Yoo, Dongju Park, Jaewook Kang, Sang-woo Lee, Woomyeong Park
- WARP: Word-level Adversarial Reprogramming Karen Hambardzumyan, Hrant Khachatrian, Jonathan May
- Adapting Language Models For Zero-shot Learning By Meta-tuning On Dataset And Prompt Collections Ruiqi Zhong, Kristy Lee, Zheng Zhang, Dan Klein
- Improving Language Models By Retrieving From Trillions Of Tokens Sebastian Borgeaud et al.
- Indonlg: Benchmark And Resources For Evaluating Indonesian Natural Language Generation Samuel Cahyawijaya et al.
- Reframing Human-ai Collaboration For Generating Free-text Explanations Sarah Wiegreffe, Jack Hessel, Swabha Swayamdipta, Mark Riedl, Yejin Choi
- Visualgpt: Data-efficient Adaptation Of Pretrained Language Models For Image Captioning Jun Chen, Han Guo, Kai Yi, Boyang Li, Mohamed Elhoseiny
- Normformer: Improved Transformer Pretraining With Extra Normalization Sam Shleifer, Jason Weston, Myle Ott
- AMMUS : A Survey Of Transformer-based Pretrained Models In Natural Language Processing Katikapalli Subramanyam Kalyan, Ajit Rajasekharan, Sivanesan Sangeetha
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- Evaluating Mixed-initiative Conversational Search Systems Via User Simulation Ivan Sekulić, Mohammad Aliannejadi, Fabio Crestani
- Explanations From Large Language Models Make Small Reasoners Better Shiyang Li et al.
- Rethinking The Role Of Demonstrations: What Makes In-context Learning Work? Sewon Min et al.
- On The Effect Of Pretraining Corpora On In-context Learning By A Large-scale Language Model Seongjin Shin et al.
- Demystifying Prompts In Language Models Via Perplexity Estimation Hila Gonen, Srini Iyer, Terra Blevins, Noah A. Smith, Luke Zettlemoyer
- Interleaving Retrieval With Chain-of-thought Reasoning For Knowledge-intensive Multi-step Questions Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal
- Cogvideo: Large-scale Pretraining For Text-to-video Generation Via Transformers Wenyi Hong, Ming Ding, Wendi Zheng, Xinghan Liu, Jie Tang
- Rethinking With Retrieval: Faithful Large Language Model Inference Hangfeng He, Hongming Zhang, Dan Roth
- The Unreliability Of Explanations In Few-shot Prompting For Textual Reasoning Xi Ye, Greg Durrett
- Diffusion-lm Improves Controllable Text Generation Xiang Lisa Li, John Thickstun, Ishaan Gulrajani, Percy Liang, Tatsunori B. Hashimoto
- Using Large Language Models To Simulate Multiple Humans And Replicate Human Subject Studies Gati Aher, Rosa I. Arriaga, Adam Tauman Kalai
- Data Augmentation For Intent Classification With Off-the-shelf Large Language Models Gaurav Sahu et al.
- Promptcap: Prompt-guided Task-aware Image Captioning Yushi Hu et al.
- Contrastive Decoding: Open-ended Text Generation As Optimization Xiang Lisa Li et al.
- Synchromesh: Reliable Code Generation From Pre-trained Language Models Gabriel Poesia et al.
- Moral Mimicry: Large Language Models Produce Moral Rationalizations Tailored To Political Identity Gabriel Simmons
- Ignore Previous Prompt: Attack Techniques For Language Models Fábio Perez, Ian Ribeiro
- Mass-editing Memory In A Transformer Kevin Meng, Arnab Sen Sharma, Alex Andonian, Yonatan Belinkov, David Bau
- Zeroquant: Efficient And Affordable Post-training Quantization For Large-scale Transformers Zhewei Yao et al.
- Chatgpt Makes Medicine Easy To Swallow: An Exploratory Case Study On Simplified Radiology Reports Katharina Jeblick et al.
- Flashattention: Fast And Memory-efficient Exact Attention With Io-awareness Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré
- Teaching Models To Express Their Uncertainty In Words Stephanie Lin, Jacob Hilton, Owain Evans
- Alexatm 20B: Few-shot Learning Using A Large-scale Multilingual Seq2seq Model Saleh Soltan et al.
- Accelerating Attention Through Gradient-based Learned Runtime Pruning Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, Hadi Esmaeilzadeh, Mingu Kang
- Leveraging Large Language Models For Multiple Choice Question Answering Joshua Robinson, Christopher Michael Rytting, David Wingate
- Training Compute-optimal Large Language Models Jordan Hoffmann et al.
- Do Language Models Plagiarize? Jooyoung Lee, Thai Le, Jinghui Chen, Dongwon Lee
- Can Large Language Models Truly Understand Prompts? A Case Study With Negated Prompts Joel Jang, Seonghyeon Ye, Minjoon Seo
- News Summarization And Evaluation In The Era Of GPT-3 Tanya Goyal, Junyi Jessy Li, Greg Durrett
- Do Large Language Models Know What Humans Know? Sean Trott, Cameron Jones, Tyler Chang, James Michaelov, Benjamin Bergen
- Biogpt: Generative Pre-trained Transformer For Biomedical Text Generation And Mining Renqian Luo et al.
- Controllable Natural Language Generation With Contrastive Prefixes Jing Qian, Li Dong, Yelong Shen, Furu Wei, Weizhu Chen
- Diffuseq: Sequence To Sequence Text Generation With Diffusion Models Shansan Gong, Mukai Li, Jiangtao Feng, Zhiyong Wu, Lingpeng Kong
- Ask Me Anything: A Simple Strategy For Prompting Language Models Simran Arora et al.
- Scaling Autoregressive Models For Content-rich Text-to-image Generation Jiahui Yu et al.
- Coca: Contrastive Captioners Are Image-text Foundation Models Jiahui Yu et al.
- Zerogen: Efficient Zero-shot Learning Via Dataset Generation Jiacheng Ye et al.
- Chain-of-thought Prompting Elicits Reasoning In Large Language Models Jason Wei et al.
- Visconde: Multi-document QA With GPT-3 And Neural Reranking Jayr Pereira, Robson Fidalgo, Roberto Lotufo, Rodrigo Nogueira
- Gpt-neox-20b: An Open-source Autoregressive Language Model Sid Black et al.
- Action-gpt: Leveraging Large-scale Language Models For Improved And Generalized Action Generation Sai Shashank Kalakonda, Shubh Maheshwari, Ravi Kiran Sarvadevabhatla
- Neural Theory-of-mind? On The Limits Of Social Intelligence In Large Lms Maarten Sap, Ronan Lebras, Daniel Fried, Yejin Choi
- What Language Model Architecture And Pretraining Objective Work Best For Zero-shot Generalization? Thomas Wang et al.
- Training Language Models To Follow Instructions With Human Feedback Long Ouyang et al.
- Structured Like A Language Model: Analysing AI As An Automated Subject Liam Magee, Vanicka Arora, Luke Munn
- Galactica: A Large Language Model For Science Ross Taylor et al.
- Language Models That Seek For Knowledge: Modular Search & Generation For Dialogue And Prompt Completion Kurt Shuster et al.
- Distilling Reasoning Capabilities Into Smaller Language Models Kumar Shridhar, Alessandro Stolfo, Mrinmaya Sachan
- Who Is GPT-3? An Exploration Of Personality, Values And Demographics Marilù Miotto, Nicola Rossberg, Bennett Kleinberg
- OPT: Open Pre-trained Transformer Language Models Susan Zhang et al.
- Can Large Language Models Reason About Medical Questions? Valentin Liévin, Christoffer Egeberg Hother, Andreas Geert Motzfeldt, Ole Winther
- Gpt-3-driven Pedagogical Agents For Training Children's Curious Question-asking Skills Rania Abdelghani et al.
- Self-conditioned Embedding Diffusion For Text Generation Robin Strudel et al.
- A Systematic Evaluation Of Large Language Models Of Code Frank F. Xu, Uri Alon, Graham Neubig, Vincent J. Hellendoorn
- NLX-GPT: A Model For Natural Language Explanations In Vision And Vision-language Tasks Fawaz Sammani, Tanmoy Mukherjee, Nikos Deligiannis
- Compilable Neural Code Generation With Compiler Feedback Xin Wang et al.
- GPTQ: Accurate Post-training Quantization For Generative Pre-trained Transformers Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh
- CREPE: Can Vision-language Foundation Models Reason Compositionally? Zixian Ma et al.
- A Generative Language Model For Few-shot Aspect-based Sentiment Analysis Ehsan Hosseini-asl, Wenhao Liu, Caiming Xiong
- Lm-nav: Robotic Navigation With Large Pre-trained Models Of Language, Vision, And Action Dhruv Shah, Blazej Osinski, Brian Ichter, Sergey Levine
- VIMA: General Robot Manipulation With Multimodal Prompts Yunfan Jiang et al.
- Binding Language Models In Symbolic Languages Zhoujun Cheng et al.
- Least-to-most Prompting Enables Complex Reasoning In Large Language Models Denny Zhou et al.
- Future Transformer For Long-term Action Anticipation Dayoung Gong, Joonseok Lee, Manjin Kim, Seong Jong Ha, Minsu Cho
- CERT: Continual Pre-training On Sketches For Library-oriented Code Generation Daoguang Zan et al.
- Why Can GPT Learn In-context? Language Models Implicitly Perform Gradient Descent As Meta-optimizers Damai Dai et al.
- Putting Gpt-3's Creativity To The (alternative Uses) Test Claire Stevenson, Iris Smal, Matthijs Baas, Raoul Grasman, Han Van Der Maas
- Fast Inference From Transformers Via Speculative Decoding Yaniv Leviathan, Matan Kalman, Yossi Matias
- Prompting GPT-3 To Be Reliable Chenglei Si et al.
- Complexity-based Prompting For Multi-step Reasoning Yao Fu, Hao Peng, Ashish Sabharwal, Peter Clark, Tushar Khot
- Why Does Surprisal From Larger Transformer-based Language Models Provide A Poorer Fit To Human Reading Times? Byung-doh Oh, William Schuler
- Exploring The Limits Of Domain-adaptive Training For Detoxifying Large-scale Language Models Boxin Wang et al.
- Is GPT-3 A Good Data Annotator? Bosheng Ding et al.
- Super-naturalinstructions: Generalization Via Declarative Instructions On 1600+ NLP Tasks Yizhong Wang et al.
- Impact Of Pretraining Term Frequencies On Few-shot Reasoning Yasaman Razeghi, Robert L. Iv Logan, Matt Gardner, Sameer Singh
- Language Models Can See: Plugging Visual Controls In Text Generation Yixuan Su et al.
- Analogy Generation By Prompting Large Language Models: A Case Study Of Instructgpt Bhavya Bhavya, Jinjun Xiong, Chengxiang Zhai
- Thinking About GPT-3 In-context Learning For Biomedical IE? Think Again Bernal Jiménez Gutiérrez et al.
- GODEL: Large-scale Pre-training For Goal-directed Dialog Baolin Peng et al.
- Large Language Models Are Better Reasoners With Self-verification Yixuan Weng et al.
- Automatic Chain Of Thought Prompting In Large Language Models Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola
- GLM-130B: An Open Bilingual Pre-trained Model Aohan Zeng et al.
- Grips: Gradient-free, Edit-based Instruction Search For Prompting Large Language Models Archiki Prasad, Peter Hase, Xiang Zhou, Mohit Bansal
- Zero-shot Video Question Answering Via Frozen Bidirectional Language Models Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- UL2: Unifying Language Learning Paradigms Yi Tay et al.
- Making Large Language Models Better Reasoners With Step-aware Verifier Yifei Li et al.
- The AI Teacher Test: Measuring The Pedagogical Ability Of Blender And GPT-3 In Educational Dialogues Anaïs Tack, Chris Piech
- Generating Training Data With Language Models: Towards Zero-shot Language Understanding Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han
- Language Models Of Code Are Few-shot Commonsense Learners Aman Madaan, Shuyan Zhou, Uri Alon, Yiming Yang, Graham Neubig
- Contrastive Search Is What You Need For Neural Text Generation Yixuan Su, Nigel Collier
- Text And Patterns: For Effective Chain Of Thought, It Takes Two To Tango Aman Madaan, Amir Yazdanbakhsh
- Commonsenseqa 2.0: Exposing The Limits Of AI Through Gamification Alon Talmor et al.
- Memory-assisted Prompt Editing To Improve GPT-3 After Deployment Aman Madaan, Niket Tandon, Peter Clark, Yiming Yang
- Active Example Selection For In-context Learning Yiming Zhang, Shi Feng, Chenhao Tan
- A Model-agnostic Data Manipulation Method For Persona-based Dialogue Generation Yu Cao, Wei Bi, Meng Fang, Shuming Shi, Dacheng Tao
- WANLI: Worker And AI Collaboration For Natural Language Inference Dataset Creation Alisa Liu, Swabha Swayamdipta, Noah A. Smith, Yejin Choi
- Transformer Language Models Without Positional Encodings Still Learn Positional Information Adi Haviv, Ori Ram, Ofir Press, Peter Izsak, Omer Levy
- Language Models Are Greedy Reasoners: A Systematic Formal Analysis Of Chain-of-thought Abulhair Saparov, He He
- Scaling Up Models And Data With T5x And Seqio Adam Roberts et al.
- Beyond The Imitation Game: Quantifying And Extrapolating The Capabilities Of Language Models Aarohi Shammie Srivastava et al.
- Learn To Explain: Multimodal Reasoning Via Thought Chains For Science Question Answering Pan Lu et al.
- Dynamic Prompt Learning Via Policy Gradient For Semi-structured Mathematical Reasoning Pan Lu et al.
- Large Language Models And The Reverse Turing Test Terrence Sejnowski
- Measuring And Narrowing The Compositionality Gap In Language Models Ofir Press et al.
- Emergent Analogical Reasoning In Large Language Models Taylor Webb, Keith J. Holyoak, Hongjing Lu
- Chatgpt: The End Of Online Exam Integrity? Teo Susnjak
- Demonstrate-search-predict: Composing Retrieval And Language Models For Knowledge-intensive NLP Omar Khattab et al.
- Unifiedskg: Unifying And Multi-tasking Structured Knowledge Grounding With Text-to-text Language Models Tianbao Xie et al.
- Thinking Fast And Slow In Large Language Models Thilo Hagendorff, Sarah Fabi, Michal Kosinski
- SGPT: GPT Sentence Embeddings For Semantic Search Niklas Muennighoff
- Large Language Models Are Reasoning Teachers Namgyu Ho, Laura Schmid, Se-young Yun
- Clinical Prompt Learning With Frozen Language Models Niall Taylor, Yi Zhang, Dan Joyce, Alejo Nevado-holgado, Andrey Kormilitzin
- Transformer Feed-forward Layers Build Predictions By Promoting Concepts In The Vocabulary Space Mor Geva, Avi Caciularu, Kevin Ro Wang, Yoav Goldberg
- Efficient Training Of Language Models To Fill In The Middle Mohammad Bavarian et al.
- Dylora: Parameter Efficient Tuning Of Pre-trained Models Using Dynamic Search-free Low-rank Adaptation Mojtaba Valipour, Mehdi Rezagholizadeh, Ivan Kobyzev, Ali Ghodsi
- Rlprompt: Optimizing Discrete Text Prompts With Reinforcement Learning Mingkai Deng et al.
- Evaluating Human-language Model Interaction Mina Lee et al.
- Few-shot Training Llms For Project-specific Code-summarization Toufique Ahmed, Premkumar Devanbu
- Coauthor: Designing A Human-ai Collaborative Writing Dataset For Exploring Language Model Capabilities Mina Lee, Percy Liang, Qian Yang
- GPT Takes The Bar Exam Michael Ii Bommarito, Daniel Martin Katz
- Re2g: Retrieve, Rerank, Generate Michael Glass et al.
- Black-box Tuning For Language-model-as-a-service Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu
- Large Language Models Are Zero-shot Reasoners Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa
- Decomposed Prompting: A Modular Approach For Solving Complex Tasks Tushar Khot et al.
- 3DALL-E: Integrating Text-to-image AI In 3D Design Workflows Vivian Liu, Jo Vermeulen, George Fitzmaurice, Justin Matejka
- Help Me Write A Poem: Instruction Tuning As A Vehicle For Collaborative Poetry Writing Tuhin Chakrabarty, Vishakh Padmakumar, He He
- M6-rec: Generative Pretrained Language Models Are Open-ended Recommender Systems Zeyu Cui, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang
- Zero- And Few-shot Prompting With Llms: A Comparative Study With Fine-tuned Models For Bangla Sentiment Analysis Md. Arid Hasan et al.
- Can Llms Express Their Uncertainty? An Empirical Evaluation Of Confidence Elicitation In Llms Miao Xiong et al.
- Do Large Language Models Resemble Humans In Language Use? Zhenguang G. Cai, Xufeng Duan, David A. Haslett, Shuqi Wang, Martin J. Pickering
- A Systematic Study And Comprehensive Evaluation Of Chatgpt On Benchmark Datasets Md Tahmid Rahman Laskar et al.
- Enhancing CLIP With GPT-4: Harnessing Visual Descriptions As Prompts Mayug Maniparambil et al.
- Gptaraeval: A Comprehensive Evaluation Of Chatgpt On Arabic NLP Md Tawkat Islam Khondaker, Abdul Waheed, El Moatez Billah Nagoudi, Muhammad Abdul-mageed
- Unleashing The Emergent Cognitive Synergy In Large Language Models: A Task-solving Agent Through Multi-persona Self-collaboration Zhenhailong Wang et al.
- Drivegpt4: Interpretable End-to-end Autonomous Driving Via Large Language Model Zhenhua Xu et al.
- An Empirical Evaluation Of Using Large Language Models For Automated Unit Test Generation Max Schäfer, Sarah Nadi, Aryaz Eghbali, Frank Tip
- Co-writing With Opinionated Language Models Affects Users' Views Maurice Jakesch, Advait Bhat, Daniel Buschek, Lior Zalmanson, Mor Naaman
- Voicebox: Text-guided Multilingual Universal Speech Generation At Scale Matthew Le et al.
- Distilling Large Language Models For Matching Patients To Clinical Trials Mauro Nievas, Aditya Basu, Yanshan Wang, Hrituraj Singh
- Large Language Models Effectively Leverage Document-level Context For Literary Translation, But Critical Errors Persist Marzena Karpinska, Mohit Iyyer
- LLM Self Defense: By Self Examination, Llms Know They Are Being Tricked Mansi Phute et al.
- The Reversal Curse: Llms Trained On "A Is B" Fail To Learn "B Is A" Lukas Berglund et al.
- Chatgpt For Vulnerability Detection, Classification, And Repair: How Far Are We? Michael Fu, Chakkrit Tantithamthavorn, Van Nguyen, Trung Le
- Document-level Machine Translation With Large Language Models Longyue Wang et al.
- Generative Artificial Intelligence In Learning Analytics: Contextualising Opportunities And Challenges Through The Learning Analytics Cycle Lixiang Yan, Roberto Martinez-maldonado, Dragan Gašević
- Practical And Ethical Challenges Of Large Language Models In Education: A Systematic Scoping Review Lixiang Yan et al.
- Comparing Sentence-level Suggestions To Message-level Suggestions In Ai-mediated Communication Liye Fu, Benjamin Newman, Maurice Jakesch, Sarah Kreps
- Give Us The Facts: Enhancing Large Language Models With Knowledge Graphs For Fact-aware Language Modeling Linyao Yang, Hongyang Chen, Zhao Li, Xiao Ding, Xindong Wu
- Human-ai Collaboration In Thematic Analysis Using Chatgpt: A User Study And Design Recommendations Lixiang Yan et al.
- Driving With Llms: Fusing Object-level Vector Modality For Explainable Autonomous Driving Long Chen et al.
- Leveraging Pre-trained Large Language Models To Construct And Utilize World Models For Model-based Task Planning Lin Guan, Karthik Valmeekam, Sarath Sreedharan, Subbarao Kambhampati
- Scaling Autoregressive Multi-modal Models: Pretraining And Instruction Tuning Lili Yu et al.
- Judging Llm-as-a-judge With Mt-bench And Chatbot Arena Lianmin Zheng et al.
- Can Chatgpt Replace Stackoverflow? A Study On Robustness And Reliability Of Large Language Model Code Generation Li Zhong, Zilong Wang
- Zero-shot Next-item Recommendation Using Large Pretrained Language Models Lei Wang, Ee-peng Lim
- Surgicalgpt: End-to-end Language-vision GPT For Visual Question Answering In Surgery Lalithkumar Seenivasan, Mobarakol Islam, Gokul Kannan, Hongliang Ren
- Superclue: A Comprehensive Chinese Large Language Model Benchmark Liang Xu et al.
- Sentimentgpt: Exploiting GPT For Advanced Sentiment Analysis And Its Departure From Current Machine Learning Kiana Kheiri, Hamid Karimi
- Just Tell Me: Prompt Engineering In Business Process Management Kiran Busch, Alexander Rochlitzer, Diana Sola, Henrik Leopold
- 14 Examples Of How Llms Can Transform Materials Science And Chemistry: A Reflection On A Large Language Model Hackathon Kevin Maik Jablonka et al.
- News Verifiers Showdown: A Comparative Performance Evaluation Of Chatgpt 3.5, Chatgpt 4.0, Bing AI, And Bard In News Fact-checking Kevin Matthe Caramancion
- Automating Customer Service Using Langchain: Building Custom Open-source GPT Chatbot For Organizations Keivalya Pandya, Mehfuza Holia
- Speak, Memory: An Archaeology Of Books Known To Chatgpt/gpt-4 Kent K. Chang, Mackenzie Cramer, Sandeep Soni, David Bamman
- Just Ask For Calibration: Strategies For Eliciting Calibrated Confidence Scores From Language Models Fine-tuned With Human Feedback Katherine Tian et al.
- Evaluating Language Models For Mathematics Through Interactions Katherine M. Collins et al.
- A Survey Of GPT-3 Family Large Language Models Including Chatgpt And GPT-4 Katikapalli Subramanyam Kalyan
- Biomedical Knowledge Graph-optimized Prompt Generation For Large Language Models Karthik Soman et al.
- Waffling Around For Performance: Visual Classification With Random Words And Broad Concepts Karsten Roth et al.
- Chipgpt: How Far Are We From Natural Language Hardware Design Kaiyan Chang et al.
- The Imitation Game: Detecting Human And Ai-generated Texts In The Era Of Chatgpt And BARD Kadhim Hayawi, Sakib Shahriar, Sujith Samuel Mathew
- Not What You've Signed Up For: Compromising Real-world Llm-integrated Applications With Indirect Prompt Injection Kai Greshake et al.
- Evaluation And Analysis Of Hallucination In Large Vision-language Models Junyang Wang et al.
- Writer-defined AI Personas For On-demand Feedback Generation Karim Benharrak, Tim Zindulka, Florian Lehmann, Hendrik Heuer, Daniel Buschek
- A Comprehensive Capability Analysis Of GPT-3 And GPT-3.5 Series Models Junjie Ye et al.
- Recommendation As Instruction Following: A Large Language Model Empowered Recommendation Approach Junjie Zhang et al.
- Is Chatgpt A Good Recommender? A Preliminary Study Junling Liu et al.
- Evaluating GPT-4 And Chatgpt On Japanese Medical Licensing Examinations Jungo Kasai, Yuhei Kasai, Keisuke Sakaguchi, Yutaro Yamada, Dragomir Radev
- Chatcounselor: A Large Language Models For Mental Health Support June M. Liu et al.
- Spear Phishing With Large Language Models Julian Hazell
- Jatmo: Prompt Injection Defense By Task-specific Finetuning Julien Piet et al.
- Minigpt-v2: Large Language Model As A Unified Interface For Vision-language Multi-task Learning Jun Chen et al.
- MEGA: Multilingual Evaluation Of Generative AI Kabir Ahuja et al.
- Phoenix: Democratizing Chatgpt Across Languages Zhihong Chen et al.
- Towards Llm-based Autograding For Short Textual Answers Johannes Schneider, Bernd Schenk, Christina Niklaus
- Is Chatgpt Fair For Recommendation? Evaluating Fairness In Large Language Model Recommendation Jizhi Zhang et al.
- The Political Ideology Of Conversational AI: Converging Evidence On Chatgpt's Pro-environmental, Left-libertarian Orientation Jochen Hartmann, Jasper Schwenzow, Maximilian Witte
- Gptscore: Evaluate As You Desire Jinlan Fu, See-kiong Ng, Zhengbao Jiang, Pengfei Liu
- The Potential And Pitfalls Of Using A Large Language Model Such As Chatgpt Or GPT-4 As A Clinical Assistant Jingqing Zhang et al.
- Structgpt: A General Framework For Large Language Model To Reason Over Structured Data Jinhao Jiang et al.
- On The Robustness Of Chatgpt: An Adversarial And Out-of-distribution Perspective Jindong Wang et al.
- Geotechnical Parrot Tales (GPT): Harnessing Large Language Models In Geotechnical Engineering Krishna Kumar
- Empowering Molecule Discovery For Molecule-caption Translation With Large Language Models: A Chatgpt Perspective Jiatong Li et al.
- Badgpt: Exploring Security Vulnerabilities Of Chatgpt Via Backdoor Attacks To Instructgpt Jiawen Shi, Yixin Liu, Pan Zhou, Lichao Sun
- Unified-io 2: Scaling Autoregressive Multimodal Models With Vision, Language, Audio, And Action Jiasen Lu et al.
- Set-of-mark Prompting Unleashes Extraordinary Visual Grounding In GPT-4V Jianwei Yang et al.
- Ethical Chatgpt: Concerns, Challenges, And Commandments Jianlong Zhou, Heimo Müller, Andreas Holzinger, Fang Chen
- Language Models Meet World Models: Embodied Experiences Enhance Language Models Jiannan Xiang et al.
- Think-on-graph: Deep And Responsible Reasoning Of Large Language Model On Knowledge Graph Jiashuo Sun et al.
- The Impact Of Chatgpt And Llms On Medical Imaging Stakeholders: Perspectives And Use Cases Jiancheng Yang, Hongwei Bran Li, Donglai Wei
- ICL-D3IE: In-context Learning With Diverse Demonstrations Updating For Document Information Extraction Jiabang He et al.
- Graphgpt: Graph Instruction Tuning For Large Language Models Jiabin Tang et al.
- LLM Lies: Hallucinations Are Not Bugs, But Features As Adversarial Examples Jia-yu Yao et al.
- GPT-3.5, GPT-4, Or BARD? Evaluating Llms Reasoning Ability In Zero-shot Setting And Performance Boosting Through Prompts Jessica López Espejel, El Hassane Ettifouri, Mahaman Sanoussi Yahaya Alassan, El Mehdi Chouham, Walid Dahhane
- Larger Language Models Do In-context Learning Differently Jerry Wei et al.
- Artificial Muses: Generative Artificial Intelligence Chatbots Have Risen To Human-level Creativity Jennifer Haase, Paul H. P. Hanel
- Evaluating Large Language Models On A Highly-specialized Topic, Radiation Oncology Physics Jason Holmes et al.
- Chatgpt: Jack Of All Trades, Master Of None Jan Kocoń et al.
- Large Language Models (GPT) Struggle To Answer Multiple-choice Questions About Code Jaromir Savelka, Arav Agarwal, Christopher Bogart, Majd Sakr
- Chatgpt To Replace Crowdsourcing Of Paraphrases For Intent Classification: Higher Diversity And Comparable Model Robustness Jan Cegin, Jakub Simko, Peter Brusilovsky
- Thrilled By Your Progress! Large Language Models (GPT-4) No Longer Struggle To Pass Assessments In Higher Education Programming Courses Jaromir Savelka, Arav Agarwal, Marshall An, Chris Bogart, Majd Sakr
- A Comparative Study Of Ai-generated (GPT-4) And Human-crafted Mcqs In Programming Education Jacob Doughty et al.
- Chip-chat: Challenges And Opportunities In Conversational Hardware Design Jason Blocklove, Siddharth Garg, Ramesh Karri, Hammond Pearce
- Evaluation Of Chatgpt On Biomedical Tasks: A Zero-shot Comparison With Fine-tuned Generative Transformers Israt Jahan, Md Tahmid Rahman Laskar, Chun Peng, Jimmy Huang
- Chatgpt In The Classroom: An Analysis Of Its Strengths And Weaknesses For Solving Undergraduate Computer Science Questions Ishika Joshi et al.
- More Robots Are Coming: Large Multimodal Models (chatgpt) Can Solve Visually Diverse Images Of Parsons Problems Irene Hou et al.
- Factuality Challenges In The Era Of Large Language Models Isabelle Augenstein et al.
- "It's Not Like Jarvis, But It's Pretty Close!" -- Examining Chatgpt's Usage Among Undergraduate Students In Computer Science Ishika Joshi, Ritvik Budhiraja, Harshal D Akolekar, Jagat Sesh Challa, Dhruv Kumar
- The Curse Of Recursion: Training On Generated Data Makes Models Forget Ilia Shumailov et al.
- Principle-driven Self-alignment Of Language Models From Scratch With Minimal Human Supervision Zhiqing Sun et al.
- Muse: Text-to-image Generation Via Masked Generative Transformers Huiwen Chang et al.
- Llama: Open And Efficient Foundation Language Models Hugo Touvron et al.
- Llmlingua: Compressing Prompts For Accelerated Inference Of Large Language Models Huiqiang Jiang, Qianhui Wu, Chin-yew Lin, Yuqing Yang, Lili Qiu
- Chatgpt Chemistry Assistant For Text Mining And Prediction Of MOF Synthesis Zhiling Zheng, Oufan Zhang, Christian Borgs, Jennifer T. Chayes, Omar M. Yaghi
- "It's A Fair Game", Or Is It? Examining How Users Navigate Disclosure Risks And Benefits When Using Llm-based Conversational Agents Zhiping Zhang et al.
- Building Cooperative Embodied Agents Modularly With Large Language Models Hongxin Zhang et al.
- Fingpt: Open-source Financial Large Language Models Hongyang Yang, Xiao-yang Liu, Christina Dan Wang
- Doctorglm: Fine-tuning Your Chinese Doctor Is Not A Herculean Task Honglin Xiong et al.
- Semantic Compression With Large Language Models Henry Gilbert, Michael Sandborn, Douglas C. Schmidt, Jesse Spencer-smith, Jules White
- Bioinstruct: Instruction Tuning Of Large Language Models For Biomedical Natural Language Processing Hieu Tran, Zhichao Yang, Zonghai Yao, Hong Yu
- Large Language Models Can Infer Psychological Dispositions Of Social Media Users Heinrich Peters, Sandra Matz
- Boosting Theory-of-mind Performance In Large Language Models Via Prompting Shima Rahimi Moghaddam, Christopher J. Honey
- Mathprompter: Mathematical Reasoning Using Large Language Models Shima Imani, Liang Du, Harsh Shrivastava
- Unifying Large Language Models And Knowledge Graphs: A Roadmap Shirui Pan et al.
- Llm-in-the-loop: Leveraging Large Language Model For Thematic Analysis Shih-chieh Dai, Aiping Xiong, Lun-wei Ku
- Reasoning With Language Model Is Planning With World Model Shibo Hao et al.
- Toolkengpt: Augmenting Frozen Language Models With Massive Tools Via Tool Embeddings Shibo Hao, Tianyang Liu, Zhen Wang, Zhiting Hu
- Gorilla: Large Language Model Connected With Massive Apis Shishir G. Patil, Tianjun Zhang, Xin Wang, Joseph E. Gonzalez
- Why Does Chatgpt Fall Short In Providing Truthful Answers? Shen Zheng, Jie Huang, Kevin Chen-chuan Chang
- Recommender Systems With Generative Retrieval Shashank Rajput et al.
- Evaluation Of Chatgpt Family Of Models For Biomedical Reasoning And Classification Shan Chen et al.
- Verigen: A Large Language Model For Verilog Code Generation Shailja Thakur et al.
- Decoding Chatgpt: A Taxonomy Of Existing Research, Current Challenges, And Possible Future Directions Shahab Saquib Sohail et al.
- The Cot Collection: Improving Zero-shot And Few-shot Learning Of Language Models Via Chain-of-thought Fine-tuning Seungone Kim et al.
- Factscore: Fine-grained Atomic Evaluation Of Factual Precision In Long Form Text Generation Sewon Min et al.
- On Codex Prompt Engineering For OCL Generation: An Empirical Study Seif Abukhalaf, Mohammad Hamdaqa, Foutse Khomh
- The Moral Authority Of Chatgpt Sebastian Krügel, Andreas Ostermaier, Matthias Uhl
- Next-gpt: Any-to-any Multimodal LLM Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, Tat-seng Chua
- H₂O: Heavy-hitter Oracle For Efficient Generative Inference Of Large Language Models Zhenyu Zhang et al.
- GPT-RE: In-context Learning For Relation Extraction Using Large Language Models Zhen Wan et al.
- A Comparative Study Of Open-source Large Language Models, GPT-4 And Claude 2: Multiple-choice Test Taking In Nephrology Sean Wu et al.
- Generating Phishing Attacks Using Chatgpt Sayak Saha Roy, Krishna Vamsi Naragam, Shirin Nilizadeh
- Medalign: A Clinician-generated Dataset For Instruction Following With Electronic Medical Records Scott L. Fleming et al.
- Chatgpt Or Human? Detect And Explain. Explaining Decisions Of Machine Learning Model For Detecting Short Chatgpt-generated Text Sandra Mitrović, Davide Andreoletti, Omran Ayoub
- Let's Have A Chat! A Conversation With Chatgpt: Technology, Applications, And Limitations Sakib Shahriar, Kadhim Hayawi
- Ai-assisted Coding: Experiments With GPT-4 Russell A Poldrack, Thomas Lu, Gašper Beguš
- Verify-and-edit: A Knowledge-enhanced Chain-of-thought Framework Ruochen Zhao, Xingxuan Li, Shafiq Joty, Chengwei Qin, Lidong Bing
- Are Emergent Abilities Of Large Language Models A Mirage? Rylan Schaeffer, Brando Miranda, Sanmi Koyejo
- Fine-tuning Language Models With Just Forward Passes Sadhika Malladi et al.
- Does Synthetic Data Generation Of Llms Help Clinical Text Mining? Ruixiang Tang, Xiaotian Han, Xiaoqian Jiang, Xia Hu
- Chatgpt Vs. Google: A Comparative Study Of Search Performance And User Experience Ruiyun Rayna Xu, Yue Katherine Feng, Hailiang Chen
- Gpt4tools: Teaching Large Language Model To Use Tools Via Self-instruction Rui Yang et al.
- Gpteval: A Survey On Assessments Of Chatgpt And GPT-4 Rui Mao, Guanyi Chen, Xulang Zhang, Frank Guerin, Erik Cambria
- Secrets Of RLHF In Large Language Models Part I: PPO Rui Zheng et al.
- Audiogpt: Understanding And Generating Speech, Music, Sound, And Talking Head Rongjie Huang et al.
- Tinystories: How Small Can Language Models Be And Still Speak Coherent English? Ronen Eldan, Yuanzhi Li
- Chatgpt Is Not All You Need. A State Of The Art Review Of Large Generative AI Models Roberto Gozalo-brizuela, Eduardo C. Garrido-merchan
- Llm-assisted Content Analysis: Using Large Language Models To Support Deductive Coding Robert Chew, John Bollenbacher, Michael Wenger, Jessica Speer, Annice Kim
- Prompt, Generate, Then Cache: Cascade Of Foundation Models Makes Strong Few-shot Learners Renrui Zhang et al.
- Scaling Laws For Language Encoding Models In Fmri Richard Antonello, Aditya Vaidya, Alexander G. Huth
- Chatgpt Versus Traditional Question Answering For Knowledge Graphs: Current Status And Future Directions Towards Knowledge Graph Chatbots Reham Omar, Omij Mangukiya, Panos Kalnis, Essam Mansour
- Chatgpt As A Factual Inconsistency Evaluator For Text Summarization Zheheng Luo, Qianqian Xie, Sophia Ananiadou
- How Secure Is Code Generated By Chatgpt? Raphaël Khoury, Anderson R. Avila, Jacob Brunelle, Baba Mamadou Camara
- Sabiá: Portuguese Large Language Models Ramon Pires, Hugo Abonizio, Thales Sales Almeida, Rodrigo Nogueira
- Large Language Models Predict Human Sensory Judgments Across Six Modalities Raja Marjieh, Ilia Sucholutsky, Pol Van Rijn, Nori Jacoby, Thomas L. Griffiths
- Can We Trust The Evaluation On Chatgpt? Rachith Aiyappa, Jisun An, Haewoon Kwak, Yong-yeol Ahn
- Embers Of Autoregression: Understanding Large Language Models Through The Problem They Are Trained To Solve R. Thomas Mccoy, Shunyu Yao, Dan Friedman, Matthew Hardy, Thomas L. Griffiths
- Lawyer Llama Technical Report Quzhe Huang et al.
- Evaluation Of Chatgpt-generated Medical Responses: A Systematic Review And Meta-analysis Qiuhong Wei et al.
- Can Large Language Models Replace Humans In The Systematic Review Process? Evaluating Gpt-4's Efficacy In Screening And Extracting Data From Peer-reviewed And Grey Literature In Multiple Languages Qusai Khraisha, Sophie Put, Johanna Kappenberg, Azza Warraitch, Kristin Hadfield
- Translating Radiology Reports Into Plain Language Using Chatgpt And GPT-4 With Prompt Learning: Promising Results, Limitations, And Potential Qing Lyu et al.
- Faithful Chain-of-thought Reasoning Qing Lyu et al.
- Genegpt: Augmenting Large Language Models With Domain Tools For Improved Access To Biomedical Information Qiao Jin, Yifan Yang, Qingyu Chen, Zhiyong Lu
- Medcpt: Contrastive Pre-trained Transformers With Large-scale Pubmed Search Logs For Zero-shot Biomedical Information Retrieval Qiao Jin et al.
- Are Large Language Models Geospatially Knowledgeable? Prabin Bhandari, Antonios Anastasopoulos, Dieter Pfoser
- Harnessing Llms In Curricular Design: Using GPT-4 To Support Authoring Of Learning Objectives Pragnya Sridhar et al.
- Selfcheckgpt: Zero-resource Black-box Hallucination Detection For Generative Large Language Models Potsawee Manakul, Adian Liusie, Mark J. F. Gales
- Students' Perceptions And Preferences Of Generative Artificial Intelligence Feedback For Programming Zhengdong Zhang et al.
- Regulating Chatgpt And Other Large Generative AI Models Philipp Hacker, Andreas Engel, Marco Mauer
- Llama-adapter V2: Parameter-efficient Visual Instruction Model Peng Gao et al.
- GPT Has Become Financially Literate: Insights From Financial Literacy Tests Of GPT And A Preliminary Test Of How People Use It As A Source Of Advice Paweł Niszczota, Sami Abbas
- Graphologue: Exploring Large Language Model Responses With Interactive Diagrams Peiling Jiang, Jude Rayan, Steven P. Dow, Haijun Xia
- VISAR: A Human-ai Argumentative Writing Assistant With Visual Programming And Rapid Draft Prototyping Zheng Zhang, Jie Gao, Ranjodh Singh Dhaliwal, Toby Jia-jun Li
- Internlm-xcomposer: A Vision-language Large Model For Advanced Text-image Comprehension And Composition Pan Zhang et al.
- Dspy: Compiling Declarative Language Model Calls Into Self-improving Pipelines Omar Khattab et al.
- GPT-4 Technical Report Openai et al.
- Chameleon: Plug-and-play Compositional Reasoning With Large Language Models Pan Lu et al.
- Ontochatgpt Information System: Ontology-driven Structured Prompts For Chatgpt Meta-learning Oleksandr Palagin, Vladislav Kaverinskiy, Anna Litvin, Kyrylo Malakhov
- Hallucinations In Large Multilingual Translation Models Nuno M. Guerreiro et al.
- Large Language Models Are Built-in Autoregressive Search Engines Noah Ziems, Wenhao Yu, Zhihan Zhang, Meng Jiang
- Faith And Fate: Limits Of Transformers On Compositionality Nouha Dziri et al.
- Reflexion: Language Agents With Verbal Reinforcement Learning Noah Shinn et al.
- Enhancing Chat Language Models By Scaling High-quality Instructional Conversations Ning Ding et al.
- Chatgpt Is A Knowledgeable But Inexperienced Solver: An Investigation Of Commonsense Problem In Large Language Models Ning Bian et al.
- CAT-LM: Training Language Models On Aligned Code And Tests Nikitha Rao, Kush Jain, Uri Alon, Claire Le Goues, Vincent J. Hellendoorn
- Automated Annotation With Generative AI Requires Validation Nicholas Pangakis, Samuel Wolken, Neil Fasching
- Sources Of Hallucination By Large Language Models On Inference Tasks Nick Mckenna et al.
- Self-contradictory Hallucinations Of Large Language Models: Evaluation, Detection And Mitigation Niels Mündler, Jingxuan He, Slobodan Jenko, Martin Vechev
- Jais And Jais-chat: Arabic-centric Foundation And Instruction-tuned Open Generative Large Language Models Neha Sengupta et al.
- A Stitch In Time Saves Nine: Detecting And Mitigating Hallucinations Of Llms By Validating Low-confidence Generation Neeraj Varshney, Wenlin Yao, Hongming Zhang, Jianshu Chen, Dong Yu
- Large Language Models Are Zero-shot Time Series Forecasters Nate Gruver, Marc Finzi, Shikai Qiu, Andrew Gordon Wilson
- Exploring The Potential Of Large Language Models To Generate Formative Programming Feedback Natalie Kiesler, Dominic Lohr, Hieke Keuning
- Chatgpt MT: Competitive For High- (but Not Low-) Resource Languages Nathaniel R. Robinson, Perez Ogayo, David R. Mortensen, Graham Neubig
- Consistency Analysis Of Chatgpt Myeongjun Erik Jang, Thomas Lukasiewicz
- Using Large Language Models To Generate Junit Tests: An Empirical Study Mohammed Latif Siddiq et al.
- A Review Of Chatgpt Applications In Education, Marketing, Software Engineering, And Healthcare: Benefits, Drawbacks, And Research Directions Mohammad Fraiwan, Natheer Khasawneh
- Api-bank: A Comprehensive Benchmark For Tool-augmented Llms Minghao Li et al.
- Reward Design With Language Models Minae Kwon, Sang Michael Xie, Kalesha Bullard, Dorsa Sadigh
- Scalable Extraction Of Training Data From (production) Language Models Milad Nasr et al.
- Evaluating Large Language Models In Theory Of Mind Tasks Michal Kosinski
- Detecting Llm-generated Text In Computing Education: A Comparative Study For Chatgpt Cases Michael Sheinman Orenstrakh, Oscar Karnalim, Carlos Anibal Suarez, Michael Liut
- A Large Language Model Approach To Educational Survey Feedback Analysis Michael J. Parker, Caitlin Anderson, Claire Stone, Yearim Oh
- Video-chatgpt: Towards Detailed Video Understanding Via Large Vision And Language Models Muhammad Maaz, Hanoona Rasheed, Salman Khan, Fahad Shahbaz Khan
- LAMM: Language-assisted Multi-modal Instruction-tuning Dataset, Framework, And Benchmark Zhenfei Yin et al.
- Large Language Models Are Effective Text Rankers With Pairwise Ranking Prompting Zhen Qin et al.
- Empirical Study Of Zero-shot NER With Chatgpt Tingyu Xie et al.
- Large Language Models Are State-of-the-art Evaluators Of Translation Quality Tom Kocmi, Christian Federmann
- Qlora: Efficient Finetuning Of Quantized Llms Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, Luke Zettlemoyer
- Medalpaca -- An Open-source Collection Of Medical Conversational AI Models And Training Data Tianyu Han et al.
- RLHF-V: Towards Trustworthy Mllms Via Behavior Alignment From Fine-grained Correctional Human Feedback Tianyu Yu et al.
- Generalized Planning In PDDL Domains With Pretrained Large Language Models Tom Silver et al.
- Encouraging Divergent Thinking In Large Language Models Through Multi-agent Debate Tian Liang et al.
- Diagnostic Reasoning Prompts Reveal The Potential For Large Language Model Interpretability In Medicine Thomas Savage, Ashwin Nayak, Robert Gallo, Ekanath Rangan, Jonathan H Chen
- Red Teaming Chatgpt Via Jailbreaking: Bias, Robustness, Reliability And Toxicity Terry Yue Zhuo, Yujin Huang, Chunyang Chen, Zhenchang Xing
- Deception Abilities Emerged In Large Language Models Thilo Hagendorff
- Hallusionbench: An Advanced Diagnostic Suite For Entangled Language Hallucination And Visual Illusion In Large Vision-language Models Tianrui Guan et al.
- Multimodal-gpt: A Vision And Language Model For Dialogue With Humans Tao Gong et al.
- Is Chatgpt A Highly Fluent Grammatical Error Correction System? A Comprehensive Evaluation Tao Fang et al.
- Caption Anything: Interactive Image Description With Diverse Multimodal Controls Teng Wang et al.
- What Can Large Language Models Do In Chemistry? A Comprehensive Benchmark On Eight Tasks Taicheng Guo et al.
- Chatgpt: Beginning Of An End Of Manual Linguistic Data Annotation? Use Case Of Automatic Genre Identification Taja Kuzman, Igor Mozetič, Nikola Ljubešić
- Sparks Of Artificial General Intelligence: Early Experiments With GPT-4 Sébastien Bubeck et al.
- Large Language Models As General Pattern Machines Suvir Mirchandani et al.
- Observations On Llms For Telecom Domain: Capabilities And Limitations Sumit Soman, Ranjani H G
- Uncovering Chatgpt's Capabilities In Recommender Systems Sunhao Dai et al.
- Textbooks Are All You Need Suriya Gunasekar et al.
- Orca: Progressive Learning From Complex Explanation Traces Of GPT-4 Subhabrata Mukherjee et al.
- Transformative Effects Of Chatgpt On Modern Education: Emerging Era Of AI Chatbots Sukhpal Singh Gill et al.
- Analyzing The Performance Of GPT-3.5 And GPT-4 In Grammatical Error Correction Steven Coyne, Keisuke Sakaguchi, Diana Galvan-sosa, Michael Zock, Kentaro Inui
- AI, Write An Essay For Me: A Large-scale Comparison Of Human-written Versus Chatgpt-generated Essays Steffen Herbold, Annette Hautli-janisz, Ute Heuer, Zlata Kikteva, Alexander Trautsch
- Chatgpt Perpetuates Gender Bias In Machine Translation And Ignores Non-gendered Pronouns: Findings Across Bengali And Five Other Low-resource Languages Sourojit Ghosh, Aylin Caliskan
- Chatgpt Is Fun, But It Is Not Funny! Humor Is Still Challenging Large Language Models Sophie Jentzsch, Kristian Kersting
- Zhongjing: Enhancing The Chinese Medical Capabilities Of Large Language Model Through Expert Feedback And Real-world Multi-turn Dialogue Songhua Yang et al.
- Revisiting Relation Extraction In The Era Of Large Language Models Somin Wadhwa, Silvio Amir, Byron C. Wallace
- Principled Instructions Are All You Need For Questioning Llama-1/2, GPT-3.5/4 Sondos Mahmoud Bsharat, Aidar Myrzakhan, Zhiqiang Shen
- Llm-empowered Chatbots For Psychiatrist And Patient Simulation: Application And Evaluation Siyuan Chen et al.
- Metagpt: Meta Programming For A Multi-agent Collaborative Framework Sirui Hong et al.
- On The Possibilities Of Ai-generated Text Detection Souradip Chakraborty et al.
- Thoughtsource: A Central Hub For Large Language Model Reasoning Data Simon Ott et al.
- Wikichat: Stopping The Hallucination Of Large Language Model Chatbots By Few-shot Grounding On Wikipedia Sina J. Semnani, Violet Z. Yao, Heidi C. Zhang, Monica S. Lam
- Mind Meets Machine: Unravelling Gpt-4's Cognitive Psychology Sifatkaur Dhingra, Manmeet Singh, Vaisakh SB, Neetiraj Malviya, Sukhpal Singh Gill
- Mariogpt: Open-ended Text2level Generation Through Large Language Models Shyam Sudhakaran et al.
- Tree Of Thoughts: Deliberate Problem Solving With Large Language Models Shunyu Yao et al.
- A Survey On Multimodal Large Language Models Shukang Yin et al.
- Opportunities And Challenges For Chatgpt And Large Language Models In Biomedicine And Health Shubo Tian et al.
- Automl-gpt: Automatic Machine Learning With GPT Shujian Zhang, Chengyue Gong, Lemeng Wu, Xingchao Liu, Mingyuan Zhou
- From Words To Watts: Benchmarking The Energy Costs Of Large Language Model Inference Siddharth Samsi et al.
- Unlocking The Potential Of Chatgpt: A Comprehensive Exploration Of Its Applications, Advantages, Limitations, And Future Directions In Natural Language Processing Walid Hariri
- Memorybank: Enhancing Large Language Models With Long-term Memory Wanjun Zhong, Lianghong Guo, Qiqi Gao, He Ye, Yanlin Wang
- Inpars-v2: Large Language Models As Efficient Dataset Generators For Information Retrieval Vitor Jeronymo et al.
- Chatgpt Beyond English: Towards A Comprehensive Evaluation Of Large Language Models In Multilingual Learning Viet Dac Lai et al.
- Is GPT-4 A Reliable Rater? Evaluating Consistency In GPT-4 Text Ratings Veronika Hackl, Alexandra Elena Müller, Michael Granitzer, Maximilian Sailer
- LIDA: A Tool For Automatic Generation Of Grammar-agnostic Visualizations And Infographics Using Large Language Models Victor Dibia
- Automated Reading Passage Generation With Openai's Large Language Model Ummugul Bezirhan, Matthias Von Davier
- Automating Human Tutor-style Programming Feedback: Leveraging GPT-4 Tutor Model For Hint Generation And GPT-3.5 Student Model For Hint Validation Tung Phung et al.
- Generative AI For Programming Education: Benchmarking Chatgpt, GPT-4, And Human Tutors Tung Phung et al.
- Flashattention-2: Faster Attention With Better Parallelism And Work Partitioning Tri Dao
- Chinese Intermediate English Learners Outdid Chatgpt In Deep Cohesion: Evidence From English Narrative Writing Tongquan Zhou, Siyi Cao, Siruo Zhou, Yao Zhang, Aijing He
- Trusting Your Evidence: Hallucinate Less With Context-aware Decoding Weijia Shi et al.
- REPLUG: Retrieval-augmented Black-box Language Models Weijia Shi et al.
- Copiloting The Copilots: Fusing Large Language Models With Completion Engines For Automated Program Repair Yuxiang Wei, Chunqiu Steven Xia, Lingming Zhang
- A Preliminary Evaluation Of Chatgpt For Zero-shot Dialogue Understanding Wenbo Pan, Qiguang Chen, Xiao Xu, Wanxiang Che, Libo Qin
- A Survey Of Large Language Models Wayne Xin Zhao et al.
- Can Large Language Models Provide Useful Feedback On Research Papers? A Large-scale Empirical Analysis Weixin Liang et al.
- Is Chatgpt Equipped With Emotional Dialogue Capabilities? Weixiang Zhao et al.
- GPT Detectors Are Biased Against Non-native English Writers Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, James Zou
- Is Chatgpt Good At Search? Investigating Large Language Models As Re-ranking Agents Weiwei Sun et al.
- Layoutgpt: Compositional Visual Planning And Generation With Large Language Models Weixi Feng et al.
- Chatgraph: Interpretable Text Classification By Converting Chatgpt Knowledge To Graphs Yucheng Shi et al.
- R2gengpt: Radiology Report Generation With Frozen Llms Zhanyu Wang, Lingqiao Liu, Lei Wang, Luping Zhou
- Chatgpt For PLC/DCS Control Logic Generation Heiko Koziolek, Sten Gruener, Virendra Ashiwal
- Capabilities Of GPT-4 On Medical Challenge Problems Harsha Nori, Nicholas King, Scott Mayer Mckinney, Dean Carignan, Eric Horvitz
- Can Generalist Foundation Models Outcompete Special-purpose Tuning? Case Study In Medicine Harsha Nori et al.
- Is Chatgpt The Ultimate Programming Assistant -- How Far Is It? Haoye Tian et al.
- Extractive Summarization Via Chatgpt For Faithful Summary Generation Haopeng Zhang, Xiao Liu, Jiawei Zhang
- Chatgpt Or Grammarly? Evaluating Chatgpt On Grammatical Error Correction Benchmark Haoran Wu, Wenxuan Wang, Yuxuan Wan, Wenxiang Jiao, Michael Lyu
- Can Chatgpt Assess Human Personalities? A General Evaluation Framework Haocong Rao, Cyril Leung, Chunyan Miao
- Q-instruct: Improving Low-level Visual Abilities For Multi-modality Foundation Models Haoning Wu et al.
- Autodroid: Llm-powered Task Automation In Android Hao Wen et al.
- Safety Assessment Of Chinese Large Language Models Hao Sun, Zhexin Zhang, Jiawen Deng, Jiale Cheng, Minlie Huang
- Reasoning Implicit Sentiment With Chain-of-thought Prompting Hao Fei et al.
- Visual Instruction Tuning Haotian Liu, Chunyuan Li, Qingyang Wu, Yong Jae Lee
- Personalisation Within Bounds: A Risk Taxonomy And Policy Framework For The Alignment Of Large Language Models With Personalised Feedback Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale
- Personallm: Investigating The Ability Of Large Language Models To Express Personality Traits Hang Jiang et al.
- Choice Over Control: How Users Write With Large Language Models Using Diegetic And Non-diegetic Prompting Hai Dang, Sven Goller, Florian Lehmann, Daniel Buschek
- GPT-4 Enhanced Multimodal Grounding For Autonomous Driving: Leveraging Cross-modal Attention With Large Language Models Haicheng Liao et al.
- Wizardmath: Empowering Mathematical Reasoning For Large Language Models Via Reinforced Evol-instruct Haipeng Luo et al.
- Applying Large Language Models And Chain-of-thought For Automatic Scoring Gyeong-geon Lee, Ehsan Latif, Xuansheng Wu, Ninghao Liu, Xiaoming Zhai
- Revisiting Large Language Models As Zero-shot Relation Extractors Guozheng Li, Peng Wang, Wenjun Ke
- Auggpt: Leveraging Chatgpt For Text Data Augmentation Haixing Dai et al.
- Exploring The Psychology Of Llms' Moral And Legal Reasoning Guilherme F. C. F. Almeida, José Luiz Nunes, Neele Engelmann, Alex Wiegmann, Marcelo De Araújo
- Chatgpt Hallucinates When Attributing Answers Guido Zuccon, Bevan Koopman, Razia Shaik
- Perspectives On Large Language Models For Relevance Judgment Guglielmo Faggioli et al.
- Voyager: An Open-ended Embodied Agent With Large Language Models Guanzhi Wang et al.
- Dr Chatgpt, Tell Me What I Want To Hear: How Prompt Knowledge Impacts Health Answer Correctness Guido Zuccon, Bevan Koopman
- Chatgpt For Shaping The Future Of Dentistry: The Potential Of Multi-modal Large Language Model Hanyao Huang et al.
- Language Models Can Solve Computer Tasks Geunwoo Kim, Pierre Baldi, Stephen Mcaleer
- Performance Of The Pre-trained Large Language Model GPT-4 On Automated Short Answer Grading Gerd Kortemeyer
- Lawbench: Benchmarking Legal Knowledge Of Large Language Models Zhiwei Fei et al.
- Navgpt: Explicit Reasoning In Vision-and-language Navigation With Large Language Models Gengze Zhou, Yicong Hong, Qi Wu
- Do Large Language Models Show Decision Heuristics Similar To Humans? A Case Study Using GPT-3.5 Gaurav Suri, Lily R. Slater, Ali Ziaee, Morgan Nguyen
- Batch Prompting: Efficient Inference With Large Language Model Apis Zhoujun Cheng, Jungo Kasai, Tao Yu
- Multimodal Chatgpt For Medical Applications: An Experimental Study Of GPT-4V Zhiling Yan et al.
- Lost In Translation: Large Language Models In Non-english Content Analysis Gabriel Nicholas, Aliya Bhatia
- LLMR: Real-time Prompting Of Interactive Worlds Using Large Language Models Fernanda De La Torre et al.
- Preference Ranking Optimization For Human Alignment Feifan Song et al.
- Is Chatgpt Better Than Human Annotators? Potential And Limitations Of Chatgpt In Explaining Implicit Hate Speech Fan Huang, Haewoon Kwak, Jisun An
- Chatgpt Outperforms Crowd-workers For Text-annotation Tasks Fabrizio Gilardi, Meysam Alizadeh, Maël Kubli
- Learning To Reason Over Scene Graphs: A Case Study Of Finetuning GPT-2 Into A Robot Language Model For Grounded Task Planning Georgia Chalvatzaki et al.
- Learning To Prompt In The Classroom To Understand AI Limits: A Pilot Study Emily Theophilou et al.
- Should Chatgpt Be Biased? Challenges And Risks Of Bias In Large Language Models Emilio Ferrara
- Sparsegpt: Massive Language Models Can Be Accurately Pruned In One-shot Elias Frantar, Dan Alistarh
- Survey Of Vulnerabilities In Large Language Models Revealed By Adversarial Attacks Erfan Shayegani et al.
- Aligning Large Multimodal Models With Factually Augmented RLHF Zhiqing Sun et al.
- Llm-adapters: An Adapter Family For Parameter-efficient Fine-tuning Of Large Language Models Zhiqiang Hu et al.
- Simulating H.P. Lovecraft Horror Literature With The Chatgpt Large Language Model Eduardo C. Garrido-merchán, José Luis Arroyo-barrigüete, Roberto Gozalo-brizuela
- Gptutor: A Chatgpt-powered Programming Tool For Code Explanation Eason Chen, Ray Huang, Han-shin Chen, Yuen-hsien Tseng, Liang-yi Li
- Vipergpt: Visual Inference Via Python Execution For Reasoning Dídac Surís, Sachit Menon, Carl Vondrick
- The Falcon Series Of Open Language Models Ebtesam Almazrouei et al.
- GPT-4 Can Pass The Korean National Licensing Examination For Korean Medicine Doctors Dongyeop Jang, Tae-rim Yun, Choong-yeol Lee, Young-kyu Kwon, Chang-eop Kim
- Llm-blender: Ensembling Large Language Models With Pairwise Ranking And Generative Fusion Dongfu Jiang, Xiang Ren, Bill Yuchen Lin
- Speechgpt: Empowering Large Language Models With Intrinsic Cross-modal Conversational Abilities Dong Zhang et al.
- Evaluating Open-domain Question Answering In The Era Of Large Language Models Ehsan Kamalloo, Nouha Dziri, Charles L. A. Clarke, Davood Rafiei
- Minigpt-4: Enhancing Vision-language Understanding With Advanced Large Language Models Deyao Zhu, Jun Chen, Xiaoqian Shen, Xiang Li, Mohamed Elhoseiny
- Chatgpt Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions Deyao Zhu et al.
- Evaluating GPT-3.5 And GPT-4 Models On Brazilian University Admission Exams Desnes Nunes, Ricardo Primi, Ramon Pires, Roberto Lotufo, Rodrigo Nogueira
- Fine-tuning Chatgpt For Automatic Scoring Ehsan Latif, Xiaoming Zhai
- Using An LLM To Help With Code Understanding Daye Nam, Andrew Macvean, Vincent Hellendoorn, Bogdan Vasilescu, Brad Myers
- 3d-vista: Pre-trained Transformer For 3D Vision And Text Alignment Ziyu Zhu et al.
- CORE-GPT: Combining Open Access Research And Large Language Models For Credible, Trustworthy Question Answering David Pride, Matteo Cancellieri, Petr Knoth
- Response: Emergent Analogical Reasoning In Large Language Models Damian Hodel, Jevin West
- Have Llms Advanced Enough? A Challenging Problem Solving Benchmark For Large Language Models Daman Arora, Himanshu Gaurav Singh, Mausam
- Improving Accuracy Of GPT-3/4 Results On Biomedical Data Using A Retrieval-augmented Language Model David Soong et al.
- AI And The FCI: Can Chatgpt Project An Understanding Of Introductory Physics? Colin G. West
- Weak-to-strong Generalization: Eliciting Strong Capabilities With Weak Supervision Collin Burns et al.
- Llava-med: Training A Large Language-and-vision Assistant For Biomedicine In One Day Chunyuan Li et al.
- LIMA: Less Is More For Alignment Chunting Zhou et al.
- Chatgpt Evaluation On Sentence Level Relations: A Focus On Temporal, Causal, And Discourse Relations Chunkit Chan et al.
- Conversational Automated Program Repair Chunqiu Steven Xia, Lingming Zhang
- Progressive-hint Prompting Improves Reasoning In Large Language Models Chuanyang Zheng, Zhengying Liu, Enze Xie, Zhenguo Li, Yu Li
- Distilled GPT For Source Code Summarization Chia-yi Su, Collin Mcmillan
- Llm-powered Data Augmentation For Enhanced Cross-lingual Performance Chenxi Whitehouse, Monojit Choudhury, Alham Fikri Aji
- Is Chatgpt A General-purpose Natural Language Processing Task Solver? Chengwei Qin et al.
- An Iterative Optimizing Framework For Radiology Report Summarization With Chatgpt Chong Ma et al.
- Supporting Qualitative Analysis With Large Language Models: Combining Codebook With GPT-3 For Deductive Coding Ziang Xiao, Xingdi Yuan, Q. Vera Liao, Rania Abdelghani, Pierre-yves Oudeyer
- Visual Chatgpt: Talking, Drawing And Editing With Visual Foundation Models Chenfei Wu et al.
- Memgpt: Towards Llms As Operating Systems Charles Packer et al.
- Supporting Human-ai Collaboration In Auditing Llms With Llms Charvi Rastogi, Marco Tulio Ribeiro, Nicholas King, Harsha Nori, Saleema Amershi
- One Small Step For Generative AI, One Giant Leap For AGI: A Complete Survey On Chatgpt In AIGC Era Chaoning Zhang et al.
- Hallucination Augmented Contrastive Learning For Multimodal Large Language Model Chaoya Jiang et al.
- Wizardlm: Empowering Large Language Models To Follow Complex Instructions Can Xu et al.
- Does GPT-4 Pass The Turing Test? Cameron R. Jones, Benjamin K. Bergen
- Pmc-llama: Towards Building Open-source Language Models For Medicine Chaoyi Wu et al.
- Chatgpt And A New Academic Reality: Artificial Intelligence-written Research Papers And The Ethics Of The Large Language Models In Scholarly Publishing Brady Lund et al.
- Large Language Models On Graphs: A Comprehensive Survey Bowen Jin et al.
- Prompting Or Fine-tuning? A Comparative Study Of Large Language Models For Taxonomy Construction Boqi Chen, Fandi Yi, Dániel Varró
- MIMIC-IT: Multi-modal In-context Instruction Tuning Bo Li et al.
- Seed-bench-2: Benchmarking Multimodal Large Language Models Bohao Li et al.
- Video-llava: Learning United Visual Representation By Alignment Before Projection Bin Lin et al.
- Evaluation Of Chatgpt For Nlp-based Mental Health Applications Bishal Lamichhane
- Swiftsage: A Generative Agent With Fast And Slow Thinking For Complex Interactive Tasks Bill Yuchen Lin et al.
- Motiongpt: Human Motion As A Foreign Language Biao Jiang et al.
- Large Language Models In The Workplace: A Case Study On Prompt Engineering For Job Type Classification Benjamin Clavié, Alexandru Ciceu, Frederick Naylor, Guillaume Soulié, Thomas Brightwell
- Bad Actor, Good Advisor: Exploring The Role Of Large Language Models In Fake News Detection Beizhe Hu et al.
- Friend Or Foe? Exploring The Implications Of Large Language Models On The Science System Benedikt Fecher, Marcel Hebing, Melissa Laufer, Jörg Pohle, Fabian Sofsky
- Check Your Facts And Try Again: Improving Large Language Models With External Knowledge And Automated Feedback Baolin Peng et al.
- Instruction Tuning With GPT-4 Baolin Peng, Chunyuan Li, Pengcheng He, Michel Galley, Jianfeng Gao
- Expertprompting: Instructing Large Language Models To Be Distinguished Experts Benfeng Xu et al.
- How Close Is Chatgpt To Human Experts? Comparison Corpus, Evaluation, And Detection Biyang Guo et al.
- A Study Of Generative Large Language Model For Medical Research And Healthcare Cheng Peng et al.
- Coupling Large Language Models With Logic Programming For Robust And General Reasoning From Text Zhun Yang, Adam Ishay, Joohyung Lee
- Clinical Camel: An Open Expert-level Medical Language Model With Dialogue-based Knowledge Encoding Augustin Toma et al.
- Refactoring Programs Using Large Language Models With Few-shot Examples Atsushi Shirafuji, Yusuke Oda, Jun Suzuki, Makoto Morishita, Yutaka Watanobe
- The False Promise Of Imitating Proprietary Llms Arnav Gudibande et al.
- Exploring The Responses Of Large Language Models To Beginner Programmers' Help Requests Arto Hellas et al.
- Better Zero-shot Reasoning With Role-play Prompting Aobo Kong et al.
- Chatgpt: Applications, Opportunities, And Threats Aram Bahrini et al.
- Med-halt: Medical Domain Hallucination Test For Large Language Models Ankit Pal, Logesh Kumar Umapathi, Malaikannan Sankarasubbu
- Expel: LLM Agents Are Experiential Learners Andrew Zhao et al.
- Universal And Transferable Adversarial Attacks On Aligned Language Models Andy Zou et al.
- On The Application Of Large Language Models For Language Teaching And Assessment Technology Andrew Caines et al.
- Fundamentals Of Generative Large Language Models And Perspectives In Cyber-defense Andrei Kucharavy et al.
- Chemcrow: Augmenting Large-language Models With Chemistry Tools Andres M Bran et al.
- Openassistant Conversations -- Democratizing Large Language Model Alignment Andreas Köpf et al.
- Openflamingo: An Open-source Framework For Training Large Autoregressive Vision-language Models Anas Awadalla et al.
- Generative AI: Implications And Applications For Education Anastasia Olnancy Olga et al.
- Opening Up Chatgpt: Tracking Openness, Transparency, And Accountability In Instruction-tuned Text Generators Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse
- How Good Are GPT Models At Machine Translation? A Comprehensive Evaluation Amr Hendy et al.
- Chatgpt Is A Remarkable Tool -- For Experts Amos Azaria, Rina Azoulay, Shulamit Reches
- Fighting Fire With Fire: Can Chatgpt Detect Ai-generated Text? Amrita Bhattacharjee, Huan Liu
- Toxicity In Chatgpt: Analyzing Persona-assigned Language Models Ameet Deshpande, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan
- Self-refine: Iterative Refinement With Self-feedback Aman Madaan et al.
- Multilingual Machine Translation With Large Language Models: Empirical Results And Analysis Wenhao Zhu et al.
- A Categorical Archive Of Chatgpt Failures Ali Borji
- Jailbroken: How Does LLM Safety Training Fail? Alexander Wei, Nika Haghtalab, Jacob Steinhardt
- Smoothllm: Defending Large Language Models Against Jailbreaking Attacks Alexander Robey, Eric Wong, Hamed Hassani, George J. Pappas
- Poisoning Language Models During Instruction Tuning Alexander Wan, Eric Wallace, Sheng Shen, Dan Klein
- Can Chatgpt Forecast Stock Price Movements? Return Predictability And Large Language Models Alejandro Lopez-lira, Yuehua Tang
- Chatgpt: More Than A Weapon Of Mass Deception, Ethical Challenges And Responses From The Human-centered Artificial Intelligence (HCAI) Perspective Alejo Jose G. Sison, Marco Tulio Daza, Roberto Gozalo-brizuela, Eduardo C. Garrido-merchán
- What Does CLIP Know About A Red Circle? Visual Prompt Engineering For Vlms Aleksandar Shtedritski, Christian Rupprecht, Andrea Vedaldi
- Self-rag: Learning To Retrieve, Generate, And Critique Through Self-reflection Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, Hannaneh Hajishirzi
- Can Chatgpt And Bard Generate Aligned Assessment Items? A Reliability Analysis Against Human Performance Abdolvahab Khademi
- Conversational Ai-powered Design: Chatgpt As Designer, User, And Product A. Baki Kocaballi
- Toolllm: Facilitating Large Language Models To Master 16000+ Real-world Apis Yujia Qin et al.
- MM-REACT: Prompting Chatgpt For Multimodal Reasoning And Action Zhengyuan Yang et al.
- Translating Natural Language To Planning Goals With Large-language Models Yaqi Xie et al.
- RTLLM: An Open-source Benchmark For Design RTL Generation With Large Language Model Yao Lu, Shang Liu, Qijun Zhang, Zhiyao Xie
- Embodiedgpt: Vision-language Pre-training Via Embodied Chain Of Thought Yao Mu et al.
- Recommender Systems In The Era Of Large Language Models (llms) Zihuai Zhao et al.
- Can Chatgpt Reproduce Human-generated Labels? A Study Of Social Computing Tasks Yiming Zhu, Peixian Zhang, Ehsan-ul Haq, Pan Hui, Gareth Tyson
- Efficient And Effective Text Encoding For Chinese Llama And Alpaca Yiming Cui, Ziqing Yang, Xin Yao
- Can Chatgpt Replace Traditional KBQA Models? An In-depth Analysis Of The Question Answering Performance Of The GPT LLM Family Yiming Tan et al.
- A Survey On Large Language Model (LLM) Security And Privacy: The Good, The Bad, And The Ugly Yifan Yao et al.
- A Comprehensive Survey Of Ai-generated Content (AIGC): A History Of Generative AI From GAN To Chatgpt Yihan Cao et al.
- Summary Of Chatgpt-related Research And Perspective Towards The Future Of Large Language Models Yiheng Liu et al.
- "Kelly Is A Warm Person, Joseph Is A Role Model": Gender Biases In Llm-generated Reference Letters Yixin Wan et al.
- Jailbreaking Chatgpt Via Prompt Engineering: An Empirical Study Yi Liu et al.
- INSTRUCTEVAL: Towards Holistic Evaluation Of Instruction-tuned Large Language Models Yew Ken Chia, Pengfei Hong, Lidong Bing, Soujanya Poria
- A Multitask, Multilingual, Multimodal Evaluation Of Chatgpt On Reasoning, Hallucination, And Interactivity Yejin Bang et al.
- Hugginggpt: Solving AI Tasks With Chatgpt And Its Friends In Hugging Face Yongliang Shen et al.
- Gpt4aigchip: Towards Next-generation AI Accelerator Design Automation Via Large Language Models Yonggan Fu et al.
- Assessing Cross-cultural Alignment Between Chatgpt And Human Societies: An Empirical Study Yong Cao et al.
- Autotamp: Autoregressive Task And Motion Planning With Llms As Translators And Checkers Yongchao Chen et al.
- How Far Can Camels Go? Exploring The State Of Instruction Tuning On Open Resources Yizhong Wang et al.
- Biomedgpt: Open Multimodal Generative Pre-trained Transformer For Biomedicine Yizhen Luo et al.
- Analyzing And Mitigating Object Hallucination In Large Vision-language Models Yiyang Zhou et al.
- Pandagpt: One Model To Instruction-follow Them All Yixuan Su et al.
- Fundamental Limitations Of Alignment In Large Language Models Yotam Wolf, Noam Wies, Oshri Avnery, Yoav Levine, Amnon Shashua
- The Dark Side Of Chatgpt: Legal And Ethical Challenges From Stochastic Parrots And Hallucination Zihao Li
- Llavar: Enhanced Visual Instruction Tuning For Text-rich Image Understanding Yanzhe Zhang et al.
- Alpacafarm: A Simulation Framework For Methods That Learn From Human Feedback Yann Dubois et al.
- Bubogpt: Enabling Visual Grounding In Multi-modal Llms Yang Zhao et al.
- Specializing Smaller Language Models Towards Multi-step Reasoning Yao Fu, Hao Peng, Litu Ou, Ashish Sabharwal, Tushar Khot
- G-eval: NLG Evaluation Using GPT-4 With Better Human Alignment Yang Liu et al.
- Trustworthy Llms: A Survey And Guideline For Evaluating Large Language Models' Alignment Yang Liu et al.
- Improving Large Language Models For Clinical Named Entity Recognition Via Prompt Engineering Yan Hu et al.
- Mental-llm: Leveraging Large Language Models For Mental Health Prediction Via Online Text Data Xuhai Xu et al.
- Emotional Intelligence Of Large Language Models Xuena Wang, Xueting Li, Zi Yin, Yue Wu, Liu Jia
- Classeval: A Manually-crafted Benchmark For Evaluating Llms On Class-level Code Generation Xueying Du et al.
- Improving Language Model Negotiation With Self-play And In-context Learning From AI Feedback Yao Fu, Hao Peng, Tushar Khot, Mirella Lapata
- Can Chatgpt Pass The Vietnamese National High School Graduation Examination? Xuan-quy Dao, Ngoc-bich Le, Xuan-dung Phan, Bac-bien Ngo
- Performance Comparison Of Large Language Models On VNHSGE English Dataset: Openai Chatgpt, Microsoft Bing Chat, And Google Bard Xuan-quy Dao
- In Chatgpt We Trust? Measuring And Characterizing The Reliability Of Chatgpt Xinyue Shen, Zeyuan Chen, Michael Backes, Yang Zhang
- "Do Anything Now": Characterizing And Evaluating In-the-wild Jailbreak Prompts On Large Language Models Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, Yang Zhang
- How Robust Is GPT-3.5 To Predecessors? A Comprehensive Study On Language Understanding Tasks Xuanting Chen et al.
- Wavcaps: A Chatgpt-assisted Weakly-labelled Audio Captioning Dataset For Audio-language Multimodal Research Xinhao Mei et al.
- How To Unleash The Power Of Large Language Models For Few-shot Relation Extraction? Xin Xu, Yuqi Zhu, Xiaohan Wang, Ningyu Zhang
- Rethinking The Evaluation For Conversational Recommendation In The Era Of Large Language Models Xiaolei Wang, Xinyu Tang, Wayne Xin Zhao, Jingyuan Wang, Ji-rong Wen
- Unveiling Security, Privacy, And Ethical Concerns Of Chatgpt Xiaodong Wu, Ran Duan, Jianbing Ni
- Deceptive AI Ecosystems: The Case Of Chatgpt Xiao Zhan, Yifan Xu, Stefan Sarkadi
- HPC-GPT: Integrating Large Language Model For High-performance Computing Xianzhong Ding et al.
- Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! Xiangyu Qi et al.
- Don't Trust Chatgpt When Your Question Is Not In English: A Study Of Multilingual Abilities And Types Of Llms Xiang Zhang, Senyu Li, Bradley Hauer, Ning Shi, Grzegorz Kondrak
- MMMU: A Massive Multi-discipline Multimodal Understanding And Reasoning Benchmark For Expert AGI Xiang Yue et al.
- Visual Adversarial Examples Jailbreak Aligned Large Language Models Xiangyu Qi et al.
- Cogagent: A Visual Language Model For GUI Agents Wenyi Hong et al.
- M3exam: A Multilingual, Multimodal, Multilevel Benchmark For Examining Large Language Models Wenxuan Zhang, Sharifah Mahani Aljunied, Chang Gao, Yew Ken Chia, Lidong Bing
- Is Chatgpt A Good Translator? Yes With GPT-4 As The Engine Wenxiang Jiao et al.
- Universalner: Targeted Distillation From Large Language Models For Open Named Entity Recognition Wenxuan Zhou, Sheng Zhang, Yu Gu, Muhao Chen, Hoifung Poon
- Longbench: A Bilingual, Multitask Benchmark For Long Context Understanding Yushi Bai et al.
- Minillm: Knowledge Distillation Of Large Language Models Yuxian Gu, Li Dong, Furu Wei, Minlie Huang
- Chatdoctor: A Medical Chat Model Fine-tuned On A Large Language Model Meta-ai (llama) Using Medical Domain Knowledge Yunxiang Li et al.
- Large Language Models Are Zero-shot Rankers For Recommender Systems Yupeng Hou et al.
- Chat-rec: Towards Interactive And Explainable Llms-augmented Recommender System Yunfan Gao et al.
- Character-llm: A Trainable Agent For Role-playing Yunfan Shao, Linyang Li, Junqi Dai, Xipeng Qiu
- Exploring The Impact Of Instruction Data Scaling On Large Language Models: An Empirical Study On Real-world Use Cases Yunjie Ji et al.
- On Evaluating Adversarial Robustness Of Large Vision-language Models Yunqing Zhao et al.
- Large Language Model As Attributed Training Data Generator: A Tale Of Diversity And Bias Yue Yu et al.
- Guiding Large Language Models Via Directional Stimulus Prompting Zekun Li et al.
- MEDITRON-70B: Scaling Medical Pretraining For Large Language Models Zeming Chen et al.
- Is Chatgpt A Good Sentiment Analyzer? A Preliminary Study Zengzhi Wang et al.
- C-eval: A Multi-level Multi-discipline Chinese Evaluation Suite For Foundation Models Yuzhen Huang et al.
- Learning Gain Differences Between Chatgpt And Human Tutor Generated Algebra Hints Zachary A. Pardos, Shreya Bhandari
- Let The Llms Talk: Simulating Human-to-human Conversational QA Via Zero-shot Llm-to-llm Interactions Zahra Abbasiantaeb, Yifei Yuan, Evangelos Kanoulas, Mohammad Aliannejadi
- Monitoring Ai-modified Content At Scale: A Case Study On The Impact Of Chatgpt On AI Conference Peer Reviews Weixin Liang et al.
- Earthgpt: A Universal Multi-modal Large Language Model For Multi-sensor Image Comprehension In Remote Sensing Domain Wei Zhang, Miaoxin Cai, Tong Zhang, Yin Zhuang, Xuerui Mao
- Assessing AI Detectors In Identifying Ai-generated Code: Implications For Education Wei Hung Pan et al.
- Chatglm: A Family Of Large Language Models From GLM-130B To GLM-4 All Tools Team GLM et al.
- Chatgpt As Research Scientist: Probing Gpt's Capabilities As A Research Librarian, Research Ethicist, Data Generator And Data Predictor Steven A. Lehr, Aylin Caliskan, Suneragiri Liyanage, Mahzarin R. Banaji
- Eyes Wide Shut? Exploring The Visual Shortcomings Of Multimodal Llms Shengbang Tong et al.
- Beyond Code Generation: An Observational Study Of Chatgpt Usage In Software Engineering Practice Ranim Khojah, Mazen Mohamad, Philipp Leitner, Francisco Gomes De Oliveira Neto
- Hidden Flaws Behind Expert-level Accuracy Of Multimodal GPT-4 Vision In Medicine Qiao Jin et al.
- Me Llama: Foundation Large Language Models For Medical Applications Qianqian Xie et al.
- From Text To Transformation: A Comprehensive Review Of Large Language Models' Versatility Pravneet Kaur et al.
- Large Language Model Capabilities In Perioperative Risk Prediction And Prognostication Philip Chung et al.
- SNIFFER: Multimodal Large Language Model For Explainable Out-of-context Misinformation Detection Peng Qi, Zehong Yan, Wynne Hsu, Mong Li Lee
- Iris: An Ai-driven Virtual Tutor For Computer Science Education Patrick Bassner, Eduard Frankford, Stephan Krusche
- History Of Generative Artificial Intelligence (AI) Chatbots: Past, Present, And Future Development Md. Al-amin et al.
- Exploring Chatgpt And Its Impact On Society Md. Asraful Haque, Shuai Li
- Large Legal Fictions: Profiling Legal Hallucinations In Large Language Models Matthew Dahl, Varun Magesh, Mirac Suzgun, Daniel E. Ho
- Codeaid: Evaluating A Classroom Deployment Of An Llm-based Programming Assistant That Balances Student And Educator Needs Majeed Kazemitabaar et al.
- Capabilities Of Gemini Models In Medicine Khaled Saab et al.
- Data Is All You Need: Finetuning Llms For Chip Design Via An Automated Design-data Augmentation Framework Kaiyan Chang et al.
- Clochat: Understanding How People Customize, Interact, And Experience Personas In Large Language Models Juhye Ha, Hyeon Jeon, Daeun Han, Jinwook Seo, Changhoon Oh
- Feedback-generation For Programming Exercises With GPT-4 Imen Azaiz, Natalie Kiesler, Sven Strickroth
- Revolutionizing Finance With Llms: An Overview Of Applications And Insights Huaqin Zhao et al.
- Benchmarking Retrieval-augmented Generation For Medicine Guangzhi Xiong, Qiao Jin, Zhiyong Lu, Aidong Zhang
- Closing The Gap Between Open-source And Commercial Large Language Models For Medical Evidence Summarization Gongbo Zhang et al.
- Building Better AI Agents: A Provocation On The Utilisation Of Persona In Llm-based Conversational Agents Guangzhi Sun, Xiao Zhan, Jose Such
- Gemini 1.5: Unlocking Multimodal Understanding Across Millions Of Tokens Of Context Gemini Team et al.
- Code-aware Prompting: A Study Of Coverage Guided Test Generation In Regression Setting Using LLM Gabriel Ryan et al.
- Ai-tutoring In Software Engineering Education Eduard Frankford, Clemens Sauerwein, Patrick Bassner, Stephan Krusche, Ruth Breu
- Chemllm: A Chemical Large Language Model Di Zhang et al.
- Deepseek-coder: When The Large Language Model Meets Programming -- The Rise Of Code Intelligence Daya Guo et al.
- Generative AI In EU Law: Liability, Privacy, Intellectual Property, And Cybersecurity Claudio Novelli, Federico Casolari, Philipp Hacker, Giorgio Spedicato, Luciano Floridi
- Open Source Language Models Can Provide Feedback: Evaluating Llms' Ability To Help Students Using Gpt-4-as-a-judge Charles Koutcheme et al.
- Homogenization Effects Of Large Language Models On Human Creative Ideation Barrett R. Anderson, Jash Hemant Shah, Max Kreminski
- Taking The Next Step With Generative Artificial Intelligence: The Transformative Role Of Multimodal Large Language Models In Science Education Arne Bewersdorff et al.
- Gemini Goes To Med School: Exploring The Capabilities Of Multimodal Large Language Models On Medical Challenge Problems & Hallucinations Ankit Pal, Malaikannan Sankarasubbu
- RAG Vs Fine-tuning: Pipelines, Tradeoffs, And A Case Study On Agriculture Angels Balaguer et al.
- Why And When Llm-based Assistants Can Go Wrong: Investigating The Effectiveness Of Prompt-based Interactions For Software Help-seeking Anjali Khurana, Hari Subramonyam, Parmit K Chilana
- Financial Statement Analysis With Large Language Models Alex Kim, Maximilian Muhn, Valeri Nikolaev
- Large Language Models For Data Annotation And Synthesis: A Survey Zhen Tan et al.
- Quality Of Answers Of Generative Large Language Models Vs Peer Patients For Interpreting Lab Test Results For Lay Patients: Evaluation Study Zhe He et al.
- How Far Are We To GPT-4V? Closing The Gap To Commercial Multimodal Models With Open-source Suites Zhe Chen et al.
- Trustllm: Trustworthiness In Large Language Models Yue Huang et al.
- Understanding Llms: A Comprehensive Overview From Training To Inference Yiheng Liu et al.
- How Johnny Can Persuade Llms To Jailbreak Them: Rethinking Persuasion To Challenge AI Safety By Humanizing Llms Yi Zeng et al.
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
🏷 Has Code
- Triviaqa: A Large Scale Distantly Supervised Challenge Dataset For Reading Comprehension Mandar Joshi, Eunsol Choi, Daniel S. Weld, Luke Zettlemoyer
- Parlai: A Dialog Research Software Platform Alexander H. Miller et al.
- DP-GAN: Diversity-promoting Generative Adversarial Network For Generating Informative And Diversified Text Jingjing Xu, Xuancheng Ren, Junyang Lin, Xu Sun
- Improving Machine Reading Comprehension With General Reading Strategies Kai Sun, Dian Yu, Dong Yu, Claire Cardie
- Unified Vision-language Pre-training For Image Captioning And VQA Luowei Zhou et al.
- How Can We Know What Language Models Know? Zhengbao Jiang, Frank F. Xu, Jun Araki, Graham Neubig
- Multimodal Attention Networks For Low-level Vision-and-language Navigation Federico Landi, Lorenzo Baraldi, Marcella Cornia, Massimiliano Corsini, Rita Cucchiara
- Scalable Attentive Sentence-pair Modeling Via Distilled Sentence Embedding Oren Barkan et al.
- Attention Is Not Explanation Sarthak Jain, Byron C. Wallace
- Masked Language Model Scoring Julian Salazar, Davis Liang, Toan Q. Nguyen, Katrin Kirchhoff
- Nemo: A Toolkit For Building AI Applications Using Neural Modules Oleksii Kuchaiev et al.
- Mixture Content Selection For Diverse Sequence Generation Jaemin Cho, Minjoon Seo, Hannaneh Hajishirzi
- Cross-lingual Natural Language Generation Via Pre-training Zewen Chi et al.
- LAMOL: Language Modeling For Lifelong Language Learning Fan-keng Sun, Cheng-hao Ho, Hung-yi Lee
- Language Models As Knowledge Bases? Fabio Petroni et al.
- LXMERT: Learning Cross-modality Encoder Representations From Transformers Hao Tan, Mohit Bansal
- Unified Language Model Pre-training For Natural Language Understanding And Generation Li Dong et al.
- Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection Guangxiang Zhao et al.
- MUSE: Parallel Multi-scale Attention For Sequence To Sequence Learning Guangxiang Zhao, Xu Sun, Jingjing Xu, Zhiyuan Zhang, Liangchen Luo
- Repurposing Entailment For Multi-hop Question Answering Tasks Harsh Trivedi, Heeyoung Kwon, Tushar Khot, Ashish Sabharwal, Niranjan Balasubramanian
- Review Conversational Reading Comprehension Hu Xu, Bing Liu, Lei Shu, Philip S. Yu
- End-to-end Bias Mitigation By Modelling Biases In Corpora Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- VL-BERT: Pre-training Of Generic Visual-linguistic Representations Weijie Su et al.
- CTRL: A Conditional Transformer Language Model For Controllable Generation Nitish Shirish Keskar, Bryan Mccann, Lav R. Varshney, Caiming Xiong, Richard Socher
- Distilling Knowledge Learned In BERT For Text Generation Yen-chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu
- Freelb: Enhanced Adversarial Training For Natural Language Understanding Chen Zhu et al.
- Cosmos QA: Machine Reading Comprehension With Contextual Commonsense Reasoning Lifu Huang, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi
- Fusion Of Detected Objects In Text For Visual Question Answering Chris Alberti, Jeffrey Ling, Michael Collins, David Reitter
- ALBERT: A Lite BERT For Self-supervised Learning Of Language Representations Zhenzhong Lan et al.
- Improving Knowledge-aware Dialogue Generation Via Knowledge Base Question Answering Jian Wang et al.
- Few-shot NLG With Pre-trained Language Model Zhiyu Chen, Harini Eavani, Wenhu Chen, Yinyin Liu, William Yang Wang
- Text Summarization With Pretrained Encoders Yang Liu, Mirella Lapata
- Towards Making The Most Of BERT In Neural Machine Translation Jiacheng Yang et al.
- UNIMO: Towards Unified-modal Understanding And Generation Via Cross-modal Contrastive Learning Wei Li et al.
- Pre-trained Summarization Distillation Sam Shleifer, Alexander M. Rush
- Russiansuperglue: A Russian Language Understanding Evaluation Benchmark Tatiana Shavrina et al.
- Phobert: Pre-trained Language Models For Vietnamese Dat Quoc Nguyen, Anh Tuan Nguyen
- Layoutlmv2: Multi-modal Pre-training For Visually-rich Document Understanding Yang Xu et al.
- Lite Transformer With Long-short Range Attention Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han
- Deebert: Dynamic Early Exiting For Accelerating BERT Inference Ji Xin, Raphael Tang, Jaejun Lee, Yaoliang Yu, Jimmy Lin
- Unnatural Language Inference Koustuv Sinha, Prasanna Parthasarathi, Joelle Pineau, Adina Williams
- Detecting Hallucinated Content In Conditional Neural Sequence Generation Chunting Zhou et al.
- VD-BERT: A Unified Vision And Dialog Transformer With BERT Yue Wang et al.
- Better Robustness By More Coverage: Adversarial Training With Mixup Augmentation For Robust Fine-tuning Chenglei Si et al.
- Coreferential Reasoning Learning For Language Representation Deming Ye et al.
- On The Stability Of Fine-tuning BERT: Misconceptions, Explanations, And Strong Baselines Marius Mosbach, Maksym Andriushchenko, Dietrich Klakow
- Logic2text: High-fidelity Natural Language Generation From Logical Forms Zhiyu Chen et al.
- Deformer: Decomposing Pre-trained Transformers For Faster Question Answering Qingqing Cao, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian
- Long Range Arena: A Benchmark For Efficient Transformers Yi Tay et al.
- SOLOIST: Building Task Bots At Scale With Transfer Learning And Machine Teaching Baolin Peng et al.
- MART: Memory-augmented Recurrent Transformer For Coherent Video Paragraph Captioning Jie Lei et al.
- A Large-scale Chinese Short-text Conversation Dataset Yida Wang et al.
- Just Ask: Learning To Answer Questions From Millions Of Narrated Videos Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- CPM: A Large-scale Generative Chinese Pre-trained Language Model Zhengyan Zhang et al.
- Non-autoregressive Machine Translation With Disentangled Context Transformer Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu
- Auto-captions On GIF: A Large-scale Video-sentence Dataset For Vision-language Pre-training Yingwei Pan et al.
- POINTER: Constrained Progressive Text Generation Via Insertion-based Generative Pre-training Yizhe Zhang et al.
- How Much Knowledge Can You Pack Into The Parameters Of A Language Model? Adam Roberts, Colin Raffel, Noam Shazeer
- How Can We Know When Language Models Know? On The Calibration Of Language Models For Question Answering Zhengbao Jiang, Jun Araki, Haibo Ding, Graham Neubig
- Incorporating External Knowledge Through Pre-training For Natural Language To Code Generation Frank F. Xu, Zhengbao Jiang, Pengcheng Yin, Bogdan Vasilescu, Graham Neubig
- Earlybert: Efficient BERT Training Via Early-bird Lottery Tickets Xiaohan Chen et al.
- X-FACTR: Multilingual Factual Knowledge Retrieval From Pretrained Language Models Zhengbao Jiang, Antonios Anastasopoulos, Jun Araki, Haibo Ding, Graham Neubig
- The Language Interpretability Tool: Extensible, Interactive Visualizations And Analysis For NLP Models Ian Tenney et al.
- CERT: Contrastive Self-supervised Learning For Language Understanding Hongchao Fang, Sicheng Wang, Meng Zhou, Jiayuan Ding, Pengtao Xie
- XLM-T: Scaling Up Multilingual Machine Translation With Pretrained Cross-lingual Transformer Encoders Shuming Ma et al.
- Length-adaptive Transformer: Train Once With Length Drop, Use Anytime With Search Gyuwan Kim, Kyunghyun Cho
- Rethinking Positional Encoding In Language Pre-training Guolin Ke, Di He, Tie-yan Liu
- KGPT: Knowledge-grounded Pre-training For Data-to-text Generation Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang
- Very Deep Transformers For Neural Machine Translation Xiaodong Liu, Kevin Duh, Liyuan Liu, Jianfeng Gao
- Indic-transformers: An Analysis Of Transformer Language Models For Indian Languages Kushal Jain, Adwait Deshpande, Kumar Shridhar, Felix Laumann, Ayushman Dash
- Lightseq: A High Performance Inference Library For Transformers Xiaohui Wang, Ying Xiong, Yang Wei, Mingxuan Wang, Lei Li
- HAT: Hardware-aware Transformers For Efficient Natural Language Processing Hanrui Wang et al.
- Delight: Deep And Light-weight Transformer Sachin Mehta, Marjan Ghazvininejad, Srinivasan Iyer, Luke Zettlemoyer, Hannaneh Hajishirzi
- Adversarial Training For Large Neural Language Models Xiaodong Liu et al.
- Logical Natural Language Generation From Open-domain Tables Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang
- Funnel-transformer: Filtering Out Sequential Redundancy For Efficient Language Processing Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le
- Charbert: Character-aware Pre-trained Language Model Wentao Ma et al.
- Vokenization: Improving Language Understanding With Contextualized, Visual-grounded Supervision Hao Tan, Mohit Bansal
- Lightner: A Lightweight Tuning Paradigm For Low-resource NER Via Pluggable Prompting Xiang Chen et al.
- On Transferability Of Prompt Tuning For Natural Language Processing Yusheng Su et al.
- Vlmo: Unified Vision-language Pre-training With Mixture-of-modality-experts Hangbo Bao et al.
- Evaluating The Robustness Of Retrieval Pipelines With Query Variation Generators Gustavo Penha, Arthur Câmara, Claudia Hauff
- P-tuning V2: Prompt Tuning Can Be Comparable To Fine-tuning Universally Across Scales And Tasks Xiao Liu et al.
- Contrastive Learning For Many-to-many Multilingual Neural Machine Translation Xiao Pan, Mingxuan Wang, Liwei Wu, Lei Li
- Summ^n: A Multi-stage Summarization Framework For Long Input Dialogues And Documents Yusen Zhang et al.
- Bartscore: Evaluating Generated Text As Text Generation Weizhe Yuan, Graham Neubig, Pengfei Liu
- Text-free Prosody-aware Generative Spoken Language Modeling Eugene Kharitonov et al.
- Efficient Passage Retrieval With Hashing For Open-domain Question Answering Ikuya Yamada, Akari Asai, Hannaneh Hajishirzi
- Less Is More: Pre-train A Strong Text Encoder For Dense Retrieval Using A Weak Decoder Shuqi Lu et al.
- Multitask Prompted Training Enables Zero-shot Task Generalization Victor Sanh et al.
- Swinbert: End-to-end Transformers With Sparse Attention For Video Captioning Kevin Lin et al.
- One Chatbot Per Person: Creating Personalized Chatbots Based On Implicit User Profiles Zhengyi Ma, Zhicheng Dou, Yutao Zhu, Hanxun Zhong, Ji-rong Wen
- Fast Model Editing At Scale Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- Arat5: Text-to-text Transformers For Arabic Language Generation El Moatez Billah Nagoudi, Abdelrahim Elmadany, Muhammad Abdul-mageed
- Investigating The Limitations Of Transformers With Simple Arithmetic Tasks Rodrigo Nogueira, Zhiying Jiang, Jimmy Lin
- Lora: Low-rank Adaptation Of Large Language Models Edward J. Hu et al.
- Align And Prompt: Video-and-language Pre-training With Entity Prompts Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C. H. Hoi
- Controllable Generation From Pre-trained Language Models Via Inverse Prompting Xu Zou et al.
- Trankit: A Light-weight Transformer-based Toolkit For Multilingual Natural Language Processing Minh Van Nguyen, Viet Dac Lai, Amir Pouran Ben Veyseh, Thien Huu Nguyen
- Causal Attention For Vision-language Tasks Xu Yang, Hanwang Zhang, Guojun Qi, Jianfei Cai
- Dialoglm: Pre-trained Model For Long Dialogue Understanding And Summarization Ming Zhong, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng
- Improving And Simplifying Pattern Exploiting Training Derek Tam, Rakesh R Menon, Mohit Bansal, Shashank Srivastava, Colin Raffel
- TR-BERT: Dynamic Token Reduction For Accelerating BERT Inference Deming Ye, Yankai Lin, Yufei Huang, Maosong Sun
- Efficient Large-scale Language Model Training On GPU Clusters Using Megatron-lm Deepak Narayanan et al.
- Greedy-layer Pruning: Speeding Up Transformer Models For Natural Language Processing David Peer, Sebastian Stabinger, Stefan Engl, Antonio Rodriguez-sanchez
- Zero-shot Recommendation As Language Modeling Damien Sileo, Wout Vossen, Robbe Raymaekers
- Knowledge Neurons In Pretrained Transformers Damai Dai et al.
- Supervision Exists Everywhere: A Data Efficient Contrastive Language-image Pre-training Paradigm Yangguang Li et al.
- Dialogue State Tracking With A Language Model Using Schema-driven Prompting Chia-hsuan Lee, Hao Cheng, Mari Ostendorf
- Terapipe: Token-level Pipeline Parallelism For Training Large-scale Language Models Zhuohan Li et al.
- NÜWA: Visual Synthesis Pre-training For Neural Visual World Creation Chenfei Wu et al.
- Is GPT-3 Text Indistinguishable From Human Text? Scarecrow: A Framework For Scrutinizing Machine Text Yao Dou, Maxwell Forbes, Rik Koncel-kedziorski, Noah A. Smith, Yejin Choi
- MMBERT: Multimodal BERT Pretraining For Improved Medical VQA Yash Khare et al.
- Understanding And Overcoming The Challenges Of Efficient Transformer Quantization Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort
- Openprompt: An Open-source Framework For Prompt-learning Ning Ding et al.
- Scheduled Sampling In Vision-language Pretraining With Decoupled Encoder-decoder Network Yehao Li, Yingwei Pan, Ting Yao, Jingwen Chen, Tao Mei
- Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners Ningyu Zhang et al.
- CDLM: Cross-document Language Modeling Avi Caciularu et al.
- Debertav3: Improving Deberta Using Electra-style Pre-training With Gradient-disentangled Embedding Sharing Pengcheng He, Jianfeng Gao, Weizhu Chen
- Vl-adapter: Parameter-efficient Transfer Learning For Vision-and-language Tasks Yi-lin Sung, Jaemin Cho, Mohit Bansal
- An Empirical Study Of Training End-to-end Vision-and-language Transformers Zi-yi Dou et al.
- Denseclip: Language-guided Dense Prediction With Context-aware Prompting Yongming Rao et al.
- General-purpose Question-answering With Macaw Oyvind Tafjord, Peter Clark
- CPM-2: Large-scale Cost-effective Pre-trained Language Models Zhengyan Zhang et al.
- Clip-adapter: Better Vision-language Models With Feature Adapters Peng Gao et al.
- Commitbert: Commit Message Generation Using Pre-trained Programming Language Model Tae-hwan Jung
- Towards Continual Knowledge Learning Of Language Models Joel Jang et al.
- Multi-modal Understanding And Generation For Medical Images And Text Via Vision-language Pre-training Jong Hak Moon, Hyungyung Lee, Woncheol Shin, Young-hak Kim, Edward Choi
- XTREME-R: Towards More Challenging And Nuanced Multilingual Evaluation Sebastian Ruder et al.
- Sentence-t5: Scalable Sentence Encoders From Pre-trained Text-to-text Models Jianmo Ni et al.
- E-vil: A Dataset And Benchmark For Natural Language Explanations In Vision-language Tasks Maxime Kayser et al.
- Tip-adapter: Training-free Clip-adapter For Better Vision-language Modeling Renrui Zhang et al.
- Taming Sparsely Activated Transformer With Stochastic Experts Simiao Zuo et al.
- Deltalm: Encoder-decoder Pre-training For Language Generation And Translation By Augmenting Pretrained Multilingual Encoders Shuming Ma et al.
- Fastmoe: A Fast Mixture-of-expert Training System Jiaao He et al.
- Hiddencut: Simple Data Augmentation For Natural Language Understanding With Better Generalization Jiaao Chen, Dinghan Shen, Weizhu Chen, Diyi Yang
- Lightningdot: Pre-training Visual-semantic Embeddings For Real-time Image-text Retrieval Siqi Sun et al.
- Compacter: Efficient Low-rank Hypercomplex Adapter Layers Rabeeh Karimi Mahabadi, James Henderson, Sebastian Ruder
- Variational Information Bottleneck For Effective Low-resource Fine-tuning Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- Unifying Vision-and-language Tasks Via Text Generation Jaemin Cho, Jie Lei, Hao Tan, Mohit Bansal
- Compm: Context Modeling With Speaker's Pre-trained Memory Tracking For Emotion Recognition In Conversation Joosung Lee, Wooin Lee
- Generated Knowledge Prompting For Commonsense Reasoning Jiacheng Liu et al.
- Augmenting Sequential Recommendation With Pseudo-prior Items Via Reversely Pre-training Transformer Zhiwei Liu, Ziwei Fan, Yu Wang, Philip S. Yu
- Few-shot Knowledge Graph-to-text Generation With Pretrained Language Models Junyi Li et al.
- Visualgpt: Data-efficient Adaptation Of Pretrained Language Models For Image Captioning Jun Chen, Han Guo, Kai Yi, Boyang Li, Mohamed Elhoseiny
- Normformer: Improved Transformer Pretraining With Extra Normalization Sam Shleifer, Jason Weston, Myle Ott
- Retromae: Pre-training Retrieval-oriented Language Models Via Masked Auto-encoder Shitao Xiao, Zheng Liu, Yingxia Shao, Zhao Cao
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- Recommendation As Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5) Shijie Geng, Shuchang Liu, Zuohui Fu, Yingqiang Ge, Yongfeng Zhang
- Exploring Visual Prompts For Adapting Large-scale Models Hyojin Bahng, Ali Jahanian, Swami Sankaranarayanan, Phillip Isola
- One Embedder, Any Task: Instruction-finetuned Text Embeddings Hongjin Su et al.
- Selective Annotation Makes Language Models Better Few-shot Learners Hongjin Su et al.
- Interleaving Retrieval With Chain-of-thought Reasoning For Knowledge-intensive Multi-step Questions Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal
- Prompt Tuning For Generative Multimodal Pretrained Models Hao Yang et al.
- An Efficient Memory-augmented Transformer For Knowledge-intensive NLP Tasks Yuxiang Wu et al.
- Program Of Thoughts Prompting: Disentangling Computation From Reasoning For Numerical Reasoning Tasks Wenhu Chen, Xueguang Ma, Xinyi Wang, William W. Cohen
- A Length-extrapolatable Transformer Yutao Sun et al.
- Large Language Models Are Few(1)-shot Table Reasoners Wenhu Chen
- Reasoning With Language Model Prompting: A Survey Shuofei Qiao et al.
- Decoupling Knowledge From Memorization: Retrieval-augmented Prompt Learning Xiang Chen et al.
- Multi-stage Prompting For Knowledgeable Dialogue Generation Zihan Liu et al.
- Hybrid Transformer With Multi-level Fusion For Multimodal Knowledge Graph Completion Xiang Chen et al.
- Smoothquant: Accurate And Efficient Post-training Quantization For Large Language Models Guangxuan Xiao et al.
- Layoutlmv3: Pre-training For Document AI With Unified Text And Image Masking Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei
- Recitation-augmented Language Models Zhiqing Sun, Xuezhi Wang, Yi Tay, Yiming Yang, Denny Zhou
- Prototypical Verbalizer For Prompt-based Few-shot Tuning Ganqu Cui, Shengding Hu, Ning Ding, Longtao Huang, Zhiyuan Liu
- Language Models Are Multilingual Chain-of-thought Reasoners Freda Shi et al.
- Ignore Previous Prompt: Attack Techniques For Language Models Fábio Perez, Ian Ribeiro
- Mass-editing Memory In A Transformer Kevin Meng, Arnab Sen Sharma, Alex Andonian, Yonatan Belinkov, David Bau
- Minicons: Enabling Flexible Behavioral And Representational Analyses Of Transformer Language Models Kanishka Misra
- Speechprompt: An Exploration Of Prompt Tuning On Generative Spoken Language Model For Speech Processing Tasks Kai-wei Chang, Wei-cheng Tseng, Shang-wen Li, Hung-yi Lee
- BLIP: Bootstrapping Language-image Pre-training For Unified Vision-language Understanding And Generation Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi
- Do Language Models Plagiarize? Jooyoung Lee, Thai Le, Jinghui Chen, Dongwon Lee
- When To Make Exceptions: Exploring Language Models As Accounts Of Human Moral Judgment Zhijing Jin et al.
- Can Large Language Models Truly Understand Prompts? A Case Study With Negated Prompts Joel Jang, Seonghyeon Ye, Minjoon Seo
- Biogpt: Generative Pre-trained Transformer For Biomedical Text Generation And Mining Renqian Luo et al.
- Unified-io: A Unified Model For Vision, Language, And Multi-modal Tasks Jiasen Lu, Christopher Clark, Rowan Zellers, Roozbeh Mottaghi, Aniruddha Kembhavi
- Lilt: A Simple Yet Effective Language-independent Layout Transformer For Structured Document Understanding Jiapeng Wang, Lianwen Jin, Kai Ding
- GIT: A Generative Image-to-text Transformer For Vision And Language Jianfeng Wang et al.
- Diffuseq: Sequence To Sequence Text Generation With Diffusion Models Shansan Gong, Mukai Li, Jiangtao Feng, Zhiyong Wu, Lingpeng Kong
- Ask Me Anything: A Simple Strategy For Prompting Language Models Simran Arora et al.
- Scaling Autoregressive Models For Content-rich Text-to-image Generation Jiahui Yu et al.
- React: Synergizing Reasoning And Acting In Language Models Shunyu Yao et al.
- Visconde: Multi-document QA With GPT-3 And Neural Reranking Jayr Pereira, Robson Fidalgo, Roberto Lotufo, Rodrigo Nogueira
- Benchmarking Large Language Models For Automated Verilog RTL Code Generation Shailja Thakur et al.
- Convfinqa: Exploring The Chain Of Numerical Reasoning In Conversational Finance Question Answering Zhiyu Chen et al.
- Dall-eval: Probing The Reasoning Skills And Social Biases Of Text-to-image Generation Models Jaemin Cho, Abhay Zala, Mohit Bansal
- Gpt-neox-20b: An Open-source Autoregressive Language Model Sid Black et al.
- PAL: Program-aided Language Models Luyu Gao et al.
- Inpars: Data Augmentation For Information Retrieval Using Large Language Models Luiz Bonifacio, Hugo Abonizio, Marzieh Fadaee, Rodrigo Nogueira
- What Language Model Architecture And Pretraining Objective Work Best For Zero-shot Generalization? Thomas Wang et al.
- Efficient Few-shot Learning Without Prompts Lewis Tunstall et al.
- Exploring The Universal Vulnerability Of Prompt-based Learning Paradigm Lei Xu, Yangyi Chen, Ganqu Cui, Hongcheng Gao, Zhiyuan Liu
- Distilling Reasoning Capabilities Into Smaller Language Models Kumar Shridhar, Alessandro Stolfo, Mrinmaya Sachan
- Do As I Can, Not As I Say: Grounding Language In Robotic Affordances Michael Ahn et al.
- Reproducible Scaling Laws For Contrastive Language-image Learning Mehdi Cherti et al.
- Language Models With Image Descriptors Are Strong Few-shot Video-language Learners Zhenhailong Wang et al.
- A Systematic Evaluation Of Large Language Models Of Code Frank F. Xu, Uri Alon, Graham Neubig, Vincent J. Hellendoorn
- Vindlu: A Recipe For Effective Video-and-language Pretraining Feng Cheng et al.
- NLX-GPT: A Model For Natural Language Explanations In Vision And Vision-language Tasks Fawaz Sammani, Tanmoy Mukherjee, Nikos Deligiannis
- Codegen: An Open Large Language Model For Code With Multi-turn Program Synthesis Erik Nijkamp et al.
- Memory-based Model Editing At Scale Eric Mitchell, Charles Lin, Antoine Bosselut, Christopher D. Manning, Chelsea Finn
- GPTQ: Accurate Post-training Quantization For Generative Pre-trained Transformers Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh
- The Optimal BERT Surgeon: Scalable And Accurate Second-order Pruning For Large Language Models Eldar Kurtic et al.
- LAVIS: A Library For Language-vision Intelligence Dongxu Li et al.
- Self-adaptive In-context Learning: An Information Compression Perspective For In-context Example Selection And Ordering Zhiyong Wu, Yaoxiang Wang, Jiacheng Ye, Lingpeng Kong
- Altclip: Altering The Language Encoder In CLIP For Extended Language Capabilities Zhongzhi Chen et al.
- Lm-nav: Robotic Navigation With Large Pre-trained Models Of Language, Vision, And Action Dhruv Shah, Blazej Osinski, Brian Ichter, Sergey Levine
- VIMA: General Robot Manipulation With Multimodal Prompts Yunfan Jiang et al.
- Binding Language Models In Symbolic Languages Zhoujun Cheng et al.
- The Stack: 3 TB Of Permissively Licensed Source Code Denis Kocetkov et al.
- Learning Vector-quantized Item Representation For Transferable Sequential Recommenders Yupeng Hou, Zhankui He, Julian Mcauley, Wayne Xin Zhao
- Protoclip: Prototypical Contrastive Language Image Pretraining Delong Chen et al.
- Factpegasus: Factuality-aware Pre-training And Fine-tuning For Abstractive Summarization David Wan, Mohit Bansal
- Large Language Models Meet Nl2code: A Survey Daoguang Zan et al.
- CERT: Continual Pre-training On Sketches For Library-oriented Code Generation Daoguang Zan et al.
- Incoder: A Generative Model For Code Infilling And Synthesis Daniel Fried et al.
- Why Can GPT Learn In-context? Language Models Implicitly Perform Gradient Descent As Meta-optimizers Damai Dai et al.
- Democratizing Contrastive Language-image Pre-training: A CLIP Benchmark Of Data, Model, And Supervision Yufeng Cui, Lichen Zhao, Feng Liang, Yangguang Li, Jing Shao
- Prompt For Extraction? PAIE: Prompting Argument Interaction For Event Argument Extraction Yubo Ma et al.
- Linearly Mapping From Image To Text Space Jack Merullo, Louis Castricato, Carsten Eickhoff, Ellie Pavlick
- Texts As Images In Prompt Tuning For Multi-label Image Recognition Zixian Guo et al.
- Long-form Video-language Pre-training With Multimodal Temporal Contrastive Learning Yuchong Sun et al.
- Expanding Language-image Pretrained Models For General Video Recognition Bolin Ni et al.
- LERT: A Linguistically-motivated Pre-trained Language Model Yiming Cui, Wanxiang Che, Shijin Wang, Ting Liu
- Multi-lingual Evaluation Of Code Generation Models Ben Athiwaratkun et al.
- Prompt-aligned Gradient For Prompt Tuning Beier Zhu, Yulei Niu, Yucheng Han, Yue Wu, Hanwang Zhang
- Large Language Models Are Better Reasoners With Self-verification Yixuan Weng et al.
- Retrieval Augmentation Of Large Language Models For Lay Language Generation Yue Guo, Wei Qiu, Gondy Leroy, Sheng Wang, Trevor Cohen
- Automatic Chain Of Thought Prompting In Large Language Models Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola
- Generative Language Models For Paragraph-level Question Generation Asahi Ushio, Fernando Alva-manchego, Jose Camacho-collados
- Reshaping Robot Trajectories Using Natural Language Commands: A Study Of Multi-modal Data Alignment Using Transformers Arthur Bucker et al.
- GLM-130B: An Open Bilingual Pre-trained Model Aohan Zeng et al.
- Grips: Gradient-free, Edit-based Instruction Search For Prompting Large Language Models Archiki Prasad, Peter Hase, Xiang Zhou, Mohit Bansal
- Zero-shot Video Question Answering Via Frozen Bidirectional Language Models Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- Plug-and-play VQA: Zero-shot VQA By Conjoining Large Pretrained Models With Zero Training Anthony Meng Huat Tiong, Junnan Li, Boyang Li, Silvio Savarese, Steven C. H. Hoi
- Clinical-longformer And Clinical-bigbird: Transformers For Long Clinical Sequences Yikuan Li, Ramsey M. Wehbe, Faraz S. Ahmad, Hanyin Wang, Yuan Luo
- Contrastive Search Is What You Need For Neural Text Generation Yixuan Su, Nigel Collier
- Memory-assisted Prompt Editing To Improve GPT-3 After Deployment Aman Madaan, Niket Tandon, Peter Clark, Yiming Yang
- Dual Modality Prompt Tuning For Vision-language Pre-trained Model Yinghui Xing et al.
- Position-guided Text Prompt For Vision-language Pre-training Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan
- Dualprompt: Complementary Prompting For Rehearsal-free Continual Learning Zifeng Wang et al.
- Prompt Tuning For Discriminative Pre-trained Language Models Yuan Yao et al.
- LASP: Text-to-text Optimization For Language-aware Soft Prompting Of Vision & Language Models Adrian Bulat, Georgios Tzimiropoulos
- Scaling Up Models And Data With \(\texttt{t5x}\) And \(\texttt{seqio}\) Adam Roberts et al.
- PEVL: Position-enhanced Pre-training And Prompt Tuning For Vision-language Models Yuan Yao et al.
- Learn To Explain: Multimodal Reasoning Via Thought Chains For Science Question Answering Pan Lu et al.
- What Matters In Language Conditioned Robotic Imitation Learning Over Unstructured Data Oier Mees, Lukas Hermann, Wolfram Burgard
- Grounding Language With Visual Affordances Over Unstructured Data Oier Mees, Jessica Borja-diaz, Wolfram Burgard
- Parallel Context Windows For Large Language Models Nir Ratner et al.
- No Language Left Behind: Scaling Human-centered Machine Translation Nllb Team et al.
- Demonstrate-search-predict: Composing Retrieval And Language Models For Knowledge-intensive NLP Omar Khattab et al.
- LIFT: Language-interfaced Fine-tuning For Non-language Machine Learning Tasks Tuan Dinh et al.
- Learning To Compose Soft Prompts For Compositional Zero-shot Learning Nihal V. Nayak, Peilin Yu, Stephen H. Bach
- Unifiedskg: Unifying And Multi-tasking Structured Knowledge Grounding With Text-to-text Language Models Tianbao Xie et al.
- SGPT: GPT Sentence Embeddings For Semantic Search Niklas Muennighoff
- Vl-checklist: Evaluating Pre-trained Vision-language Models With Objects, Attributes And Relations Tiancheng Zhao et al.
- Crosslingual Generalization Through Multitask Finetuning Niklas Muennighoff et al.
- Large Language Models Are Reasoning Teachers Namgyu Ho, Laura Schmid, Se-young Yun
- Generate Rather Than Retrieve: Large Language Models Are Strong Context Generators Wenhao Yu et al.
- Maple: Multi-modal Prompt Learning Muhammad Uzair Khattak, Hanoona Rasheed, Muhammad Maaz, Salman Khan, Fahad Shahbaz Khan
- Clinical Prompt Learning With Frozen Language Models Niall Taylor, Yi Zhang, Dan Joyce, Alejo Nevado-holgado, Andrey Kormilitzin
- Toxigen: A Large-scale Machine-generated Dataset For Adversarial And Implicit Hate Speech Detection Thomas Hartvigsen et al.
- KALA: Knowledge-augmented Language Model Adaptation Minki Kang, Jinheon Baek, Sung Ju Hwang
- Towards A Unified Multi-dimensional Evaluator For Text Generation Ming Zhong et al.
- Deep Bidirectional Language-knowledge Graph Pretraining Michihiro Yasunaga et al.
- Re2g: Retrieve, Rerank, Generate Michael Glass et al.
- Promptsource: An Integrated Development Environment And Repository For Natural Language Prompts Stephen H. Bach et al.
- Decomposed Prompting: A Modular Approach For Solving Complex Tasks Tushar Khot et al.
- OFA: Unifying Architectures, Tasks, And Modalities Through A Simple Sequence-to-sequence Learning Framework Peng Wang et al.
- Enhancing CLIP With GPT-4: Harnessing Visual Descriptions As Prompts Mayug Maniparambil et al.
- Unleashing The Emergent Cognitive Synergy In Large Language Models: A Task-solving Agent Through Multi-persona Self-collaboration Zhenhailong Wang et al.
- Internvl: Scaling Up Vision Foundation Models And Aligning For Generic Visual-linguistic Tasks Zhe Chen et al.
- LLM Self Defense: By Self Examination, Llms Know They Are Being Tricked Mansi Phute et al.
- Applenet: Visual Attention Parameterized Prompt Learning For Few-shot Remote Sensing Image Generalization Using CLIP Mainak Singha, Ankit Jha, Bhupendra Solanki, Shirsha Bose, Biplab Banerjee
- The Reversal Curse: Llms Trained On "A Is B" Fail To Learn "B Is A" Lukas Berglund et al.
- Chatgpt For Vulnerability Detection, Classification, And Repair: How Far Are We? Michael Fu, Chakkrit Tantithamthavorn, Van Nguyen, Trung Le
- Document-level Machine Translation With Large Language Models Longyue Wang et al.
- Llm-grounded Diffusion: Enhancing Prompt Understanding Of Text-to-image Diffusion Models With Large Language Models Long Lian, Boyi Li, Adam Yala, Trevor Darrell
- Leveraging Pre-trained Large Language Models To Construct And Utilize World Models For Model-based Task Planning Lin Guan, Karthik Valmeekam, Sarath Sreedharan, Subbarao Kambhampati
- Do Llms Exhibit Human-like Response Biases? A Case Study In Survey Design Lindia Tjuatja, Valerie Chen, Sherry Tongshuang Wu, Ameet Talwalkar, Graham Neubig
- Improving CLIP Training With Language Rewrites Lijie Fan, Dilip Krishnan, Phillip Isola, Dina Katabi, Yonglong Tian
- Judging Llm-as-a-judge With Mt-bench And Chatbot Arena Lianmin Zheng et al.
- Logic-lm: Empowering Large Language Models With Symbolic Solvers For Faithful Logical Reasoning Liangming Pan, Alon Albalak, Xinyi Wang, William Yang Wang
- A Survey On Large Language Models For Recommendation Likang Wu et al.
- Zephyr: Direct Distillation Of LM Alignment Lewis Tunstall et al.
- Layoutllm-t2i: Eliciting Layout Guidance From LLM For Text-to-image Generation Leigang Qu, Shengqiong Wu, Hao Fei, Liqiang Nie, Tat-seng Chua
- ULIP-2: Towards Scalable Multimodal Pre-training For 3D Understanding Le Xue et al.
- Zero-shot Next-item Recommendation Using Large Pretrained Language Models Lei Wang, Ee-peng Lim
- Mvbench: A Comprehensive Multi-modal Video Understanding Benchmark Kunchang Li et al.
- Sentimentgpt: Exploiting GPT For Advanced Sentiment Analysis And Its Departure From Current Machine Learning Kiana Kheiri, Hamid Karimi
- Tallrec: An Effective And Efficient Tuning Framework To Align Large Language Model With Recommendation Keqin Bao et al.
- Automatic Prompt Augmentation And Selection With Chain-of-thought From Labeled Data Kashun Shum, Shizhe Diao, Tong Zhang
- Geochat: Grounded Large Vision-language Model For Remote Sensing Kartik Kuckreja et al.
- Waffling Around For Performance: Visual Classification With Random Words And Broad Concepts Karsten Roth et al.
- Tinyclip: CLIP Distillation Via Affinity Mimicking And Weight Inheritance Kan Stephen Wu et al.
- Full Parameter Fine-tuning For Large Language Models With Limited Resources Kai Lv et al.
- ALIP: Adaptive Language-image Pre-training With Synthetic Caption Kaicheng Yang et al.
- The Rise And Potential Of Large Language Model Based Agents: A Survey Zhiheng Xi et al.
- Evaluating GPT-4 And Chatgpt On Japanese Medical Licensing Examinations Jungo Kasai, Yuhei Kasai, Keisuke Sakaguchi, Yutaro Yamada, Dragomir Radev
- Honeybee: Locality-enhanced Projector For Multimodal LLM Junbum Cha, Wooyoung Kang, Jonghwan Mun, Byungseok Roh
- Transferable Decoding With Visual Entities For Zero-shot Image Captioning Junjie Fei et al.
- Jatmo: Prompt Injection Defense By Task-specific Finetuning Julien Piet et al.
- Minigpt-v2: Large Language Model As A Unified Interface For Vision-language Multi-task Learning Jun Chen et al.
- Phoenix: Democratizing Chatgpt Across Languages Zhihong Chen et al.
- Exploring The Benefits Of Training Expert Language Models Over Instruction Tuning Joel Jang et al.
- Is Chatgpt Fair For Recommendation? Evaluating Fairness In Large Language Model Recommendation Jizhi Zhang et al.
- Missrec: Pre-training And Transferring Multi-modal Interest-aware Sequence Representation For Recommendation Jinpeng Wang et al.
- Gptscore: Evaluate As You Desire Jinlan Fu, See-kiong Ng, Zhengbao Jiang, Pengfei Liu
- Structgpt: A General Framework For Large Language Model To Reason Over Structured Data Jinhao Jiang et al.
- Set-of-mark Prompting Unleashes Extraordinary Visual Grounding In GPT-4V Jianwei Yang et al.
- How Can Recommender Systems Benefit From Large Language Models: A Survey Jianghao Lin et al.
- Rella: Retrieval-enhanced Large Language Models For Lifelong Sequential Behavior Comprehension In Recommendation Jianghao Lin et al.
- Imagebind-llm: Multi-modality Instruction Tuning Jiaming Han et al.
- Compositional Exemplars For In-context Learning Jiacheng Ye, Zhiyong Wu, Jiangtao Feng, Tao Yu, Lingpeng Kong
- Onellm: One Framework To Align All Modalities With Language Jiaming Han et al.
- ICL-D3IE: In-context Learning With Diverse Demonstrations Updating For Document Information Extraction Jiabang He et al.
- Graphgpt: Graph Instruction Tuning For Large Language Models Jiabin Tang et al.
- LLM Lies: Hallucinations Are Not Bugs, But Features As Adversarial Examples Jia-yu Yao et al.
- Simple And Controllable Music Generation Jade Copet et al.
- Llmlingua: Compressing Prompts For Accelerated Inference Of Large Language Models Huiqiang Jiang, Qianhui Wu, Chin-yew Lin, Yuqing Yang, Lili Qiu
- Fingpt: Open-source Financial Large Language Models Hongyang Yang, Xiao-yang Liu, Christina Dan Wang
- Doctorglm: Fine-tuning Your Chinese Doctor Is Not A Herculean Task Honglin Xiong et al.
- VIP5: Towards Multimodal Foundation Models For Recommendation Shijie Geng, Juntao Tan, Shuchang Liu, Zuohui Fu, Yongfeng Zhang
- Gorilla: Large Language Model Connected With Massive Apis Shishir G. Patil, Tianjun Zhang, Xin Wang, Joseph E. Gonzalez
- The Flan Collection: Designing Data And Methods For Effective Instruction Tuning Shayne Longpre et al.
- Sur-adapter: Enhancing Text-to-image Pre-trained Diffusion Models With Large Language Models Shanshan Zhong, Zhongzhan Huang, Wushao Wen, Jinghui Qin, Liang Lin
- Next-gpt: Any-to-any Multimodal LLM Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, Tat-seng Chua
- H\(_2\)O: Heavy-hitter Oracle For Efficient Generative Inference Of Large Language Models Zhenyu Zhang et al.
- Gpt4tools: Teaching Large Language Model To Use Tools Via Self-instruction Rui Yang et al.
- Audiogpt: Understanding And Generating Speech, Music, Sound, And Talking Head Rongjie Huang et al.
- Prompt, Generate, Then Cache: Cascade Of Foundation Models Makes Strong Few-shot Learners Renrui Zhang et al.
- Llama-adapter: Efficient Fine-tuning Of Language Models With Zero-init Attention Renrui Zhang et al.
- Prompt-based Distribution Alignment For Unsupervised Domain Adaptation Shuanghao Bai et al.
- Generating With Confidence: Uncertainty Quantification For Black-box Large Language Models Zhen Lin, Shubhendu Trivedi, Jimeng Sun
- Codegeex: A Pre-trained Model For Code Generation With Multilingual Benchmarking On Humaneval-x Qinkai Zheng et al.
- Mplug-owl: Modularization Empowers Large Language Models With Multimodality Qinghao Ye et al.
- Adalora: Adaptive Budget Allocation For Parameter-efficient Fine-tuning Qingru Zhang et al.
- Active Retrieval Augmented Generation Zhengbao Jiang et al.
- Prompting The Hidden Talent Of Web-scale Speech Models For Zero-shot Task Generalization Puyuan Peng, Brian Yan, Shinji Watanabe, David Harwath
- Masked Vision And Language Pre-training With Unimodal And Multimodal Contrastive Losses For Medical Visual Question Answering Pengfei Li, Gang Liu, Jinlong He, Zixu Zhao, Shenjun Zhong
- Chat-univi: Unified Visual Representation Empowers Large Language Models With Image And Video Understanding Peng Jin, Ryuichi Takanobu, Wancai Zhang, Xiaochun Cao, Li Yuan
- Llama-adapter V2: Parameter-efficient Visual Instruction Model Peng Gao et al.
- Internlm-xcomposer: A Vision-language Large Model For Advanced Text-image Comprehension And Composition Pan Zhang et al.
- Dspy: Compiling Declarative Language Model Calls Into Self-improving Pipelines Omar Khattab et al.
- Large Language Models Are Built-in Autoregressive Search Engines Noah Ziems, Wenhao Yu, Zhihan Zhang, Meng Jiang
- Enhancing Chat Language Models By Scaling High-quality Instructional Conversations Ning Ding et al.
- Jais And Jais-chat: Arabic-centric Foundation And Instruction-tuned Open Generative Large Language Models Neha Sengupta et al.
- Self-regulating Prompts: Foundational Model Adaptation Without Forgetting Muhammad Uzair Khattak et al.
- Do Llms Understand Social Knowledge? Evaluating The Sociability Of Large Language Models With Socket Benchmark Minje Choi, Jiaxin Pei, Sagar Kumar, Chang Shu, David Jurgens
- A Simple And Effective Pruning Approach For Large Language Models Mingjie Sun, Zhuang Liu, Anna Bair, J. Zico Kolter
- Med-flamingo: A Multimodal Medical Few-shot Learner Michael Moor et al.
- Video-chatgpt: Towards Detailed Video Understanding Via Large Vision And Language Models Muhammad Maaz, Hanoona Rasheed, Salman Khan, Fahad Shahbaz Khan
- Codekgc: Code Language Model For Generative Knowledge Graph Construction Zhen Bi et al.
- RLHF-V: Towards Trustworthy Mllms Via Behavior Alignment From Fine-grained Correctional Human Feedback Tianyu Yu et al.
- Encouraging Divergent Thinking In Large Language Models Through Multi-agent Debate Tian Liang et al.
- Few-shot In-context Learning For Knowledge Base Question Answering Tianle Li et al.
- Hallusionbench: An Advanced Diagnostic Suite For Entangled Language Hallucination And Visual Illusion In Large Vision-language Models Tianrui Guan et al.
- Having Beer After Prayer? Measuring Cultural Bias In Large Language Models Tarek Naous, Michael J. Ryan, Alan Ritter, Wei Xu
- Multimodal-gpt: A Vision And Language Model For Dialogue With Humans Tao Gong et al.
- Caption Anything: Interactive Image Description With Diverse Multimodal Controls Teng Wang et al.
- What Can Large Language Models Do In Chemistry? A Comprehensive Benchmark On Eight Tasks Taicheng Guo et al.
- Uncovering Chatgpt's Capabilities In Recommender Systems Sunhao Dai et al.
- Emergent And Predictable Memorization In Large Language Models Stella Biderman et al.
- Pythia: A Suite For Analyzing Large Language Models Across Training And Scaling Stella Biderman et al.
- Zhongjing: Enhancing The Chinese Medical Capabilities Of Large Language Model Through Expert Feedback And Real-world Multi-turn Dialogue Songhua Yang et al.
- Principled Instructions Are All You Need For Questioning Llama-1/2, GPT-3.5/4 Sondos Mahmoud Bsharat, Aidar Myrzakhan, Zhiqiang Shen
- Metagpt: Meta Programming For A Multi-agent Collaborative Framework Sirui Hong et al.
- Mariogpt: Open-ended Text2level Generation Through Large Language Models Shyam Sudhakaran et al.
- Tree Of Thoughts: Deliberate Problem Solving With Large Language Models Shunyu Yao et al.
- A Survey On Multimodal Large Language Models Shukang Yin et al.
- Inpars-v2: Large Language Models As Efficient Dataset Generators For Information Retrieval Vitor Jeronymo et al.
- LIDA: A Tool For Automatic Generation Of Grammar-agnostic Visualizations And Infographics Using Large Language Models Victor Dibia
- Can Ai-generated Text Be Reliably Detected? Vinu Sankar Sadasivan, Aounon Kumar, Sriram Balasubramanian, Wenxiao Wang, Soheil Feizi
- Evaluating Correctness And Faithfulness Of Instruction-following Models For Question Answering Vaibhav Adlakha, Parishad Behnamghader, Xing Han Lu, Nicholas Meade, Siva Reddy
- Llmrec: Large Language Models With Graph Augmentation For Recommendation Wei Wei et al.
- EVA-02: A Visual Representation For Neon Genesis Yuxin Fang et al.
- BLIVA: A Simple Multimodal LLM For Better Handling Of Text-rich Visual Questions Wenbo Hu et al.
- Roco: Dialectic Multi-robot Collaboration With Large Language Models Zhao Mandi, Shreeya Jain, Shuran Song
- R2gengpt: Radiology Report Generation With Frozen Llms Zhanyu Wang, Lingqiao Liu, Lei Wang, Luping Zhou
- Ferret: Refer And Ground Anything Anywhere At Any Granularity Haoxuan You et al.
- Autodroid: Llm-powered Task Automation In Android Hao Wen et al.
- Lmdrive: Closed-loop End-to-end Driving With Large Language Models Hao Shao et al.
- Reasoning Implicit Sentiment With Chain-of-thought Prompting Hao Fei et al.
- Mplug-2: A Modularized Multi-modal Foundation Model Across Text, Image And Video Haiyang Xu et al.
- Wizardmath: Empowering Mathematical Reasoning For Large Language Models Via Reinforced Evol-instruct Haipeng Luo et al.
- Voyager: An Open-ended Embodied Agent With Large Language Models Guanzhi Wang et al.
- Efficient Streaming Language Models With Attention Sinks Guangxuan Xiao, Yuandong Tian, Beidi Chen, Song Han, Mike Lewis
- Language Models Can Solve Computer Tasks Geunwoo Kim, Pierre Baldi, Stephen Mcaleer
- Lawbench: Benchmarking Legal Knowledge Of Large Language Models Zhiwei Fei et al.
- Cheap And Quick: Efficient Vision-language Instruction Tuning For Large Language Models Gen Luo et al.
- Text Matching Improves Sequential Recommendation By Reducing Popularity Biases Zhenghao Liu et al.
- Batch Prompting: Efficient Inference With Large Language Model Apis Zhoujun Cheng, Jungo Kasai, Tao Yu
- Multimodal Chatgpt For Medical Applications: An Experimental Study Of GPT-4V Zhiling Yan et al.
- Repocoder: Repository-level Code Completion Through Iterative Retrieval And Generation Fengji Zhang et al.
- Exploring Human-like Translation Strategy With Large Language Models Zhiwei He et al.
- Empower Large Language Model To Perform Better On Industrial Domain-specific Question Answering Fangkai Yang et al.
- Codegen2: Lessons For Training Llms On Programming And Natural Languages Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou
- Towards Efficient Fine-tuning Of Pre-trained Code Models: An Experimental Study And Beyond Ensheng Shi et al.
- Sparsegpt: Massive Language Models Can Be Accurately Pruned In One-shot Elias Frantar, Dan Alistarh
- Lasuie: Unifying Information Extraction With Latent Adaptive Structure-aware Generative Language Model Hao Fei et al.
- Aligning Large Multimodal Models With Factually Augmented RLHF Zhiqing Sun et al.
- Kosmos-2: Grounding Multimodal Large Language Models To The World Zhiliang Peng et al.
- Read-only Prompt Optimization For Vision-language Few-shot Learning Dongjun Lee et al.
- MELTR: Meta Loss Transformer For Learning To Fine-tune Video Foundation Models Dohwan Ko et al.
- Minigpt-4: Enhancing Vision-language Understanding With Advanced Large Language Models Deyao Zhu, Jun Chen, Xiaoqian Shen, Xiang Li, Mohamed Elhoseiny
- Chatgpt Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions Deyao Zhu et al.
- Large Language Models For Generative Information Extraction: A Survey Derong Xu et al.
- Evaluating GPT-3.5 And GPT-4 Models On Brazilian University Admission Exams Desnes Nunes, Ricardo Primi, Ramon Pires, Roberto Lotufo, Rodrigo Nogueira
- Chateval: Towards Better Llm-based Evaluators Through Multi-agent Debate Chi-min Chan et al.
- Distilling Step-by-step! Outperforming Larger Language Models With Less Training Data And Smaller Model Sizes Cheng-yu Hsieh et al.
- Visual Chatgpt: Talking, Drawing And Editing With Visual Foundation Models Chenfei Wu et al.
- Chatdev: Communicative Agents For Software Development Chen Qian et al.
- Memgpt: Towards Llms As Operating Systems Charles Packer et al.
- MME: A Comprehensive Evaluation Benchmark For Multimodal Large Language Models Chaoyou Fu et al.
- Blackvip: Black-box Visual Prompting For Robust Transfer Learning Changdae Oh et al.
- Hallucination Augmented Contrastive Learning For Multimodal Large Language Model Chaoya Jiang et al.
- K2: A Foundation Language Model For Geoscience Knowledge Understanding And Utilization Cheng Deng et al.
- Wizardlm: Empowering Large Language Models To Follow Complex Instructions Can Xu et al.
- Compositional Chain-of-thought Prompting For Large Multimodal Models Chancharik Mitra, Brandon Huang, Trevor Darrell, Roei Herzig
- Pmc-llama: Towards Building Open-source Language Models For Medicine Chaoyi Wu et al.
- Large Language Models On Graphs: A Comprehensive Survey Bowen Jin et al.
- Adapting Large Language Models By Integrating Collaborative Semantics For Recommendation Bowen Zheng et al.
- LLM+P: Empowering Large Language Models With Optimal Planning Proficiency Bo Liu et al.
- Seed-bench-2: Benchmarking Multimodal Large Language Models Bohao Li et al.
- Video-llava: Learning United Visual Representation By Alignment Before Projection Bin Lin et al.
- Expertprompting: Instructing Large Language Models To Be Distinguished Experts Benfeng Xu et al.
- How Close Is Chatgpt To Human Experts? Comparison Corpus, Evaluation, And Detection Biyang Guo et al.
- Better Zero-shot Reasoning With Role-play Prompting Aobo Kong et al.
- Openflamingo: An Open-source Framework For Training Large Autoregressive Vision-language Models Anas Awadalla et al.
- Fighting Fire With Fire: Can Chatgpt Detect Ai-generated Text? Amrita Bhattacharjee, Huan Liu
- On Generative Agents In Recommendation An Zhang, Yuxin Chen, Leheng Sheng, Xiang Wang, Tat-seng Chua
- Multilingual Machine Translation With Large Language Models: Empirical Results And Analysis Wenhao Zhu et al.
- Smoothllm: Defending Large Language Models Against Jailbreaking Attacks Alexander Robey, Eric Wong, Hamed Hassani, George J. Pappas
- MM-REACT: Prompting Chatgpt For Multimodal Reasoning And Action Zhengyuan Yang et al.
- Collaborative Large Language Model For Recommender Systems Yaochen Zhu, Liang Wu, Qi Guo, Liangjie Hong, Jundong Li
- Flexgen: High-throughput Generative Inference Of Large Language Models With A Single GPU Ying Sheng et al.
- Element-aware Summarization With Large Language Models: Expert-aligned Evaluation And Chain-of-thought Method Yiming Wang, Zhuosheng Zhang, Rui Wang
- Efficient And Effective Text Encoding For Chinese Llama And Alpaca Yiming Cui, Ziqing Yang, Xin Yao
- Can Chatgpt Replace Traditional KBQA Models? An In-depth Analysis Of The Question Answering Performance Of The GPT LLM Family Yiming Tan et al.
- Graph Neural Prompting With Large Language Models Yijun Tian et al.
- A Comparative Study Of Pretrained Language Models For Long Clinical Text Yikuan Li, Ramsey M. Wehbe, Faraz S. Ahmad, Hanyin Wang, Yuan Luo
- Evaluating Object Hallucination In Large Vision-language Models Yifan Li et al.
- Human-centric Autonomous Systems With Llms For User Command Reasoning Yi Yang et al.
- Making Large Language Models Perform Better In Knowledge Graph Completion Yichi Zhang et al.
- INSTRUCTEVAL: Towards Holistic Evaluation Of Instruction-tuned Large Language Models Yew Ken Chia, Pengfei Hong, Lidong Bing, Soujanya Poria
- Mindmap: Knowledge Graph Prompting Sparks Graph Of Thoughts In Large Language Models Yilin Wen, Zifeng Wang, Jimeng Sun
- Autotamp: Autoregressive Task And Motion Planning With Llms As Translators And Checkers Yongchao Chen et al.
- How Far Can Camels Go? Exploring The State Of Instruction Tuning On Open Resources Yizhong Wang et al.
- Biomedgpt: Open Multimodal Generative Pre-trained Transformer For Biomedicine Yizhen Luo et al.
- Analyzing And Mitigating Object Hallucination In Large Vision-language Models Yiyang Zhou et al.
- Pandagpt: One Model To Instruction-follow Them All Yixuan Su et al.
- When Prompt-based Incremental Learning Does Not Meet Strong Pretraining Yu-ming Tang, Yi-xing Peng, Wei-shi Zheng
- Llavar: Enhanced Visual Instruction Tuning For Text-rich Image Understanding Yanzhe Zhang et al.
- Alpacafarm: A Simulation Framework For Methods That Learn From Human Feedback Yann Dubois et al.
- Bubogpt: Enabling Visual Grounding In Multi-modal Llms Yang Zhao et al.
- Llama-vid: An Image Is Worth 2 Tokens In Large Language Models Yanwei Li, Chengyao Wang, Jiaya Jia
- G-eval: NLG Evaluation Using GPT-4 With Better Human Alignment Yang Liu et al.
- Specinfer: Accelerating Generative Large Language Model Serving With Tree-based Speculative Inference And Verification Xupeng Miao et al.
- Classeval: A Manually-crafted Benchmark For Evaluating Llms On Class-level Code Generation Xueying Du et al.
- Speak Foreign Languages With Your Own Voice: Cross-lingual Neural Codec Language Modeling Ziqiang Zhang et al.
- Ghost In The Minecraft: Generally Capable Agents For Open-world Environments Via Large Language Models With Text-based Knowledge And Memory Xizhou Zhu et al.
- Wavcaps: A Chatgpt-assisted Weakly-labelled Audio Captioning Dataset For Audio-language Multimodal Research Xinhao Mei et al.
- Llm-pruner: On The Structural Pruning Of Large Language Models Xinyin Ma, Gongfan Fang, Xinchao Wang
- How To Unleash The Power Of Large Language Models For Few-shot Relation Extraction? Xin Xu, Yuqi Zhu, Xiaohan Wang, Ningyu Zhang
- Rethinking The Evaluation For Conversational Recommendation In The Era Of Large Language Models Xiaolei Wang, Xinyu Tang, Wayne Xin Zhao, Jingyuan Wang, Ji-rong Wen
- PMC-VQA: Visual Instruction Tuning For Medical Visual Question Answering Xiaoman Zhang et al.
- LISA: Reasoning Segmentation Via Large Language Model Xin Lai et al.
- Medagents: Large Language Models As Collaborators For Zero-shot Medical Reasoning Xiangru Tang et al.
- Representation Learning With Large Language Models For Recommendation Xubin Ren et al.
- Cogagent: A Visual Language Model For GUI Agents Wenyi Hong et al.
- M3exam: A Multilingual, Multimodal, Multilevel Benchmark For Examining Large Language Models Wenxuan Zhang, Sharifah Mahani Aljunied, Chang Gao, Yew Ken Chia, Lidong Bing
- Is Chatgpt A Good Translator? Yes With GPT-4 As The Engine Wenxiang Jiao et al.
- Instructblip: Towards General-purpose Vision-language Models With Instruction Tuning Wenliang Dai et al.
- SPHINX: The Joint Mixing Of Weights, Tasks, And Visual Embeddings For Multi-modal Large Language Models Ziyi Lin et al.
- Guiding Pretraining In Reinforcement Learning With Large Language Models Yuqing Du et al.
- Retentive Network: A Successor To Transformer For Large Language Models Yutao Sun et al.
- Longbench: A Bilingual, Multitask Benchmark For Long Context Understanding Yushi Bai et al.
- Minillm: Knowledge Distillation Of Large Language Models Yuxian Gu, Li Dong, Furu Wei, Minlie Huang
- Editing Large Language Models: Problems, Methods, And Opportunities Yunzhi Yao et al.
- Lampilot: An Open Benchmark Dataset For Autonomous Driving With Language Model Programs Yunsheng Ma et al.
- Large Language Models Are Zero-shot Rankers For Recommender Systems Yupeng Hou et al.
- Describe, Explain, Plan And Select: Interactive Planning With Large Language Models Enables Open-world Multi-task Agents Zihao Wang et al.
- On Evaluating Adversarial Robustness Of Large Vision-language Models Yunqing Zhao et al.
- Cachegen: KV Cache Compression And Streaming For Fast Large Language Model Serving Yuhan Liu et al.
- Contextual Object Detection With Multimodal Large Language Models Yuhang Zang, Wei Li, Jun Han, Kaiyang Zhou, Chen Change Loy
- Educhat: A Large-scale Language Model-based Chatbot System For Intelligent Education Yuhao Dan et al.
- Large Language Model As Attributed Training Data Generator: A Tale Of Diversity And Bias Yue Yu et al.
- Aligning Large Language Models With Human: A Survey Yufei Wang et al.
- Preventing Zero-shot Transfer Degradation In Continual Learning Of Vision-language Models Zangwei Zheng et al.
- Guiding Large Language Models Via Directional Stimulus Prompting Zekun Li et al.
- Fine-grained Human Feedback Gives Better Rewards For Language Model Training Zeqiu Wu et al.
- Billm: Pushing The Limit Of Post-training Quantization For Llms Wei Huang et al.
- Chatglm: A Family Of Large Language Models From GLM-130B To GLM-4 All Tools Team Glm et al.
- Large Language Models Meet Collaborative Filtering: An Efficient All-round Llm-based Recommender System Sein Kim et al.
- A Review Of Large Language Models And Autonomous Agents In Chemistry Mayk Caldas Ramos, Christopher J. Collison, Andrew D. White
- The Effect Of Sampling Temperature On Problem Solving In Large Language Models Matthew Renze, Erhan Guven
- What Large Language Models Know And What People Think They Know Mark Steyvers et al.
- The Dawn After The Dark: An Empirical Study On Factuality Hallucination In Large Language Models Junyi Li et al.
- Fine Tuning Vs. Retrieval Augmented Generation For Less Popular Knowledge Heydar Soudani, Evangelos Kanoulas, Faegheh Hasibi
- Chemllm: A Chemical Large Language Model Di Zhang et al.
- Moe-llava: Mixture Of Experts For Large Vision-language Models Bin Lin et al.
- Gemini Goes To Med School: Exploring The Capabilities Of Multimodal Large Language Models On Medical Challenge Problems & Hallucinations Ankit Pal, Malaikannan Sankarasubbu
- How Far Are We To GPT-4V? Closing The Gap To Commercial Multimodal Models With Open-source Suites Zhe Chen et al.
- A Survey On Lora Of Large Language Models Yuren Mao et al.
- Unist: A Prompt-empowered Universal Model For Urban Spatio-temporal Prediction Yuan Yuan, Jingtao Ding, Jie Feng, Depeng Jin, Yong Li
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
- Llamafactory: Unified Efficient Fine-tuning Of 100+ Language Models Yaowei Zheng et al.
- Datasets For Large Language Models: A Comprehensive Survey Yang Liu, Jiahuan Cao, Chongyu Liu, Kai Ding, Lianwen Jin
- Farsight: Fostering Responsible AI Awareness During AI Application Prototyping Zijie J. Wang, Chinmay Kulkarni, Lauren Wilcox, Michael Terry, Michael Madaio
- Measurement Of Llm's Philosophies Of Human Nature Minheng Ni et al.
🏷 ICLR
🏷 ICML
🏷 In-Context Learning
- An Explanation Of In-context Learning As Implicit Bayesian Inference Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma
- MAGMA -- Multimodal Augmentation Of Generative Models Through Adapter-based Finetuning Constantin Eichenberg, Sidney Black, Samuel Weinbach, Letitia Parcalabescu, Anette Frank
- Meta-learning Via Language Model In-context Tuning Yanda Chen, Ruiqi Zhong, Sheng Zha, George Karypis, He He
- Glam: Efficient Scaling Of Language Models With Mixture-of-experts Nan Du et al.
- What Changes Can Large-scale Language Models Bring? Intensive Study On Hyperclova: Billions-scale Korean Generative Pretrained Transformers Boseop Kim et al.
- Learning To Retrieve Prompts For In-context Learning Ohad Rubin, Jonathan Herzig, Jonathan Berant
- Metaicl: Learning To Learn In Context Sewon Min, Mike Lewis, Luke Zettlemoyer, Hannaneh Hajishirzi
- Diverse Demonstrations Improve In-context Compositional Generalization Itay Levy, Ben Bogin, Jonathan Berant
- Explanations From Large Language Models Make Small Reasoners Better Shiyang Li et al.
- Rethinking The Role Of Demonstrations: What Makes In-context Learning Work? Sewon Min et al.
- On The Effect Of Pretraining Corpora On In-context Learning By A Large-scale Language Model Seongjin Shin et al.
- Selective Annotation Makes Language Models Better Few-shot Learners Hongjin Su et al.
- Few-shot Parameter-efficient Fine-tuning Is Better And Cheaper Than In-context Learning Haokun Liu et al.
- Teaching Algorithmic Reasoning Via In-context Learning Hattie Zhou et al.
- In-context Learning For Few-shot Dialogue State Tracking Yushi Hu et al.
- Large Language Models Are Few(1)-shot Table Reasoners Wenhu Chen
- The Unreliability Of Explanations In Few-shot Prompting For Textual Reasoning Xi Ye, Greg Durrett
- PAL: Program-aided Language Models Luyu Gao et al.
- In-context Examples Selection For Machine Translation Sweta Agrawal, Chunting Zhou, Mike Lewis, Luke Zettlemoyer, Marjan Ghazvininejad
- Data Distributional Properties Drive Emergent In-context Learning In Transformers Stephanie C. Y. Chan et al.
- Legal Prompting: Teaching A Language Model To Think Like A Lawyer Fangyi Yu, Lee Quartey, Frank Schilder
- Self-adaptive In-context Learning: An Information Compression Perspective For In-context Example Selection And Ordering Zhiyong Wu, Yaoxiang Wang, Jiacheng Ye, Lingpeng Kong
- Prompting Palm For Translation: Assessing Strategies And Performance David Vilar et al.
- Rationale-augmented Ensembles In Language Models Xuezhi Wang et al.
- Why Can GPT Learn In-context? Language Models Implicitly Perform Gradient Descent As Meta-optimizers Damai Dai et al.
- Exploring Length Generalization In Large Language Models Cem Anil et al.
- In-context Learning And Induction Heads Catherine Olsson et al.
- Language Models Are General-purpose Interfaces Yaru Hao et al.
- Thinking About GPT-3 In-context Learning For Biomedical IE? Think Again Bernal Jiménez Gutiérrez et al.
- Multi-lingual Evaluation Of Code Generation Models Ben Athiwaratkun et al.
- UL2: Unifying Language Learning Paradigms Yi Tay et al.
- Internet-augmented Language Models Through Few-shot Prompting For Open-domain Question Answering Angeliki Lazaridou, Elena Gribovskaya, Wojciech Stokowiec, Nikolai Grigorev
- Large Language Models Are Human-level Prompt Engineers Yongchao Zhou et al.
- Don't Generate, Discriminate: A Proposal For Grounding Language Models To Real-world Environments Yu Gu, Xiang Deng, Yu Su
- Text And Patterns: For Effective Chain Of Thought, It Takes Two To Tango Aman Madaan, Amir Yazdanbakhsh
- Can Language Models Learn From Explanations In Context? Andrew K. Lampinen et al.
- Active Example Selection For In-context Learning Yiming Zhang, Shi Feng, Chenhao Tan
- Grounding Language With Visual Affordances Over Unstructured Data Oier Mees, Jessica Borja-diaz, Wolfram Burgard
- Parallel Context Windows For Large Language Models Nir Ratner et al.
- Demonstrate-search-predict: Composing Retrieval And Language Models For Knowledge-intensive NLP Omar Khattab et al.
- Challenging Big-bench Tasks And Whether Chain-of-thought Can Solve Them Mirac Suzgun et al.
- Retrieval-augmented Multimodal Language Modeling Michihiro Yasunaga et al.
- Black-box Tuning For Language-model-as-a-service Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu
- Decomposed Prompting: A Modular Approach For Solving Complex Tasks Tushar Khot et al.
- Zero- And Few-shot Prompting With Llms: A Comparative Study With Fine-tuned Models For Bangla Sentiment Analysis Md. Arid Hasan et al.
- Voicebox: Text-guided Multilingual Universal Speech Generation At Scale Matthew Le et al.
- Few-shot Fine-tuning Vs. In-context Learning: A Fair Comparison And Evaluation Marius Mosbach, Tiago Pimentel, Shauli Ravfogel, Dietrich Klakow, Yanai Elazar
- Enhancing Few-shot Text-to-sql Capabilities Of Large Language Models: A Study On Prompt Design Strategies Linyong Nan et al.
- Improving CLIP Training With Language Rewrites Lijie Fan, Dilip Krishnan, Phillip Isola, Dina Katabi, Yonglong Tian
- Layoutllm-t2i: Eliciting Layout Guidance From LLM For Text-to-image Generation Leigang Qu, Shengqiong Wu, Hao Fei, Liqiang Nie, Tat-seng Chua
- Query2doc: Query Expansion With Large Language Models Liang Wang, Nan Yang, Furu Wei
- Tallrec: An Effective And Efficient Tuning Framework To Align Large Language Model With Recommendation Keqin Bao et al.
- Is Chatgpt A Good Recommender? A Preliminary Study Junling Liu et al.
- The Potential And Pitfalls Of Using A Large Language Model Such As Chatgpt Or GPT-4 As A Clinical Assistant Jingqing Zhang et al.
- Grounding Language Models To Images For Multimodal Inputs And Outputs Jing Yu Koh, Ruslan Salakhutdinov, Daniel Fried
- Empowering Molecule Discovery For Molecule-caption Translation With Large Language Models: A Chatgpt Perspective Jiatong Li et al.
- Compositional Exemplars For In-context Learning Jiacheng Ye, Zhiyong Wu, Jiangtao Feng, Tao Yu, Lingpeng Kong
- ICL-D3IE: In-context Learning With Diverse Demonstrations Updating For Document Information Extraction Jiabang He et al.
- VILA: On Pre-training For Visual Language Models Ji Lin et al.
- Larger Language Models Do In-context Learning Differently Jerry Wei et al.
- Symbol Tuning Improves In-context Learning In Language Models Jerry Wei et al.
- Principle-driven Self-alignment Of Language Models From Scratch With Minimal Human Supervision Zhiqing Sun et al.
- Llmlingua: Compressing Prompts For Accelerated Inference Of Large Language Models Huiqiang Jiang, Qianhui Wu, Chin-yew Lin, Yuqing Yang, Lili Qiu
- Boosting Theory-of-mind Performance In Large Language Models Via Prompting Shima Rahimi Moghaddam, Christopher J. Honey
- Llm-in-the-loop: Leveraging Large Language Model For Thematic Analysis Shih-chieh Dai, Aiping Xiong, Lun-wei Ku
- Large Language Model Augmented Narrative Driven Recommendations Sheshera Mysore, Andrew Mccallum, Hamed Zamani
- Toolkengpt: Augmenting Frozen Language Models With Massive Tools Via Tool Embeddings Shibo Hao, Tianyang Liu, Zhen Wang, Zhiting Hu
- GPT-RE: In-context Learning For Relation Extraction Using Large Language Models Zhen Wan et al.
- Fine-tuning Language Models With Just Forward Passes Sadhika Malladi et al.
- In-context Learning Creates Task Vectors Roee Hendel, Mor Geva, Amir Globerson
- Genegpt: Augmenting Large Language Models With Domain Tools For Improved Access To Biomedical Information Qiao Jin, Yifan Yang, Qingyu Chen, Zhiyong Lu
- DIN-SQL: Decomposed In-context Learning Of Text-to-sql With Self-correction Mohammadreza Pourreza, Davood Rafiei
- Few-shot In-context Learning For Knowledge Base Question Answering Tianle Li et al.
- Is Chatgpt A Highly Fluent Grammatical Error Correction System? A Comprehensive Evaluation Tao Fang et al.
- What Can Large Language Models Do In Chemistry? A Comprehensive Benchmark On Eight Tasks Taicheng Guo et al.
- Large Language Models As General Pattern Machines Suvir Mirchandani et al.
- Revisiting Relation Extraction In The Era Of Large Language Models Somin Wadhwa, Silvio Amir, Byron C. Wallace
- Freshllms: Refreshing Large Language Models With Search Engine Augmentation Tu Vu et al.
- Llama Guard: Llm-based Input-output Safeguard For Human-ai Conversations Hakan Inan et al.
- Do We Still Need Clinical Language Models? Eric Lehman et al.
- Language Model Crossover: Variation Through Few-shot Prompting Elliot Meyerson et al.
- Kosmos-2: Grounding Multimodal Large Language Models To The World Zhiliang Peng et al.
- Chatgpt Evaluation On Sentence Level Relations: A Focus On Temporal, Causal, And Discourse Relations Chunkit Chan et al.
- An Iterative Optimizing Framework For Radiology Report Summarization With Chatgpt Chong Ma et al.
- Generative Speech Recognition Error Correction With Large Language Models And Task-activating Prompting Chao-han Huck Yang et al.
- MIMIC-IT: Multi-modal In-context Instruction Tuning Bo Li et al.
- ART: Automatic Multi-step Reasoning And Tool-use For Large Language Models Bhargavi Paranjape et al.
- Expertprompting: Instructing Large Language Models To Be Distinguished Experts Benfeng Xu et al.
- Adaptive Machine Translation With Large Language Models Yasmin Moslem, Rejwanul Haque, John D. Kelleher, Andy Way
- Making Large Language Models Perform Better In Knowledge Graph Completion Yichi Zhang et al.
- Gpt4aigchip: Towards Next-generation AI Accelerator Design Automation Via Large Language Models Yonggan Fu et al.
- Mental-llm: Leveraging Large Language Models For Mental Health Prediction Via Online Text Data Xuhai Xu et al.
- Improving Language Model Negotiation With Self-play And In-context Learning From AI Feedback Yao Fu, Hao Peng, Tushar Khot, Mirella Lapata
- Speak Foreign Languages With Your Own Voice: Cross-lingual Neural Codec Language Modeling Ziqiang Zhang et al.
- How To Unleash The Power Of Large Language Models For Few-shot Relation Extraction? Xin Xu, Yuqi Zhu, Xiaohan Wang, Ningyu Zhang
- Textbooks Are All You Need II: Phi-1.5 Technical Report Yuanzhi Li et al.
- Capabilities Of Gemini Models In Medicine Khaled Saab et al.
- MM1: Methods, Analysis & Insights From Multimodal LLM Pre-training Brandon Mckinzie et al.
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
🏷 Interpretability and Explainability
- Disentangling Language And Knowledge In Task-oriented Dialogs Dinesh Raghu, Nikhil Gupta, Mausam
- Multi-cast Attention Networks For Retrieval-based Question Answering And Response Prediction Yi Tay, Luu Anh Tuan, Siu Cheung Hui
- Recosa: Detecting The Relevant Contexts With Self-attention For Multi-turn Dialogue Generation Hainan Zhang, Yanyan Lan, Liang Pang, Jiafeng Guo, Xueqi Cheng
- Attention Is Not Explanation Sarthak Jain, Byron C. Wallace
- Answering Complex Open-domain Questions Through Iterative Query Generation Peng Qi, Xiaowen Lin, Leo Mehr, Zijian Wang, Christopher D. Manning
- Learning From Explanations With Neural Execution Tree Ziqi Wang et al.
- Learning To Deceive With Attention-based Explanations Danish Pruthi, Mansi Gupta, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton
- Abductive Commonsense Reasoning Chandra Bhagavatula et al.
- Explain Yourself! Leveraging Language Models For Commonsense Reasoning Nazneen Fatema Rajani, Bryan Mccann, Caiming Xiong, Richard Socher
- SEAL: Segment-wise Extractive-abstractive Long-form Text Summarization Yao Zhao, Mohammad Saleh, Peter J. Liu
- SPARTA: Efficient Open-domain Question Answering Via Sparse Transformer Matching Retrieval Tiancheng Zhao, Xiaopeng Lu, Kyusong Lee
- Transformers As Soft Reasoners Over Language Peter Clark, Oyvind Tafjord, Kyle Richardson
- REALM: Retrieval-augmented Language Model Pre-training Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-wei Chang
- WT5?! Training Text-to-text Models To Explain Their Predictions Sharan Narang et al.
- On The Stability Of Fine-tuning BERT: Misconceptions, Explanations, And Strong Baselines Marius Mosbach, Maksym Andriushchenko, Dietrich Klakow
- Visbert: Hidden-state Visualizations For Transformers Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- Natural Language Rationales With Full-stack Visual Reasoning: From Pixels To Semantic Frames To Commonsense Graphs Ana Marasović et al.
- The Language Interpretability Tool: Extensible, Interactive Visualizations And Analysis For NLP Models Ian Tenney et al.
- Constructing A Multi-hop QA Dataset For Comprehensive Evaluation Of Reasoning Steps Xanh Ho, Anh-khoa Duong Nguyen, Saku Sugawara, Akiko Aizawa
- Generate Natural Language Explanations For Recommendation Hanxiong Chen, Xu Chen, Shaoyun Shi, Yongfeng Zhang
- Generic Attention-model Explainability For Interpreting Bi-modal And Encoder-decoder Transformers Hila Chefer, Shir Gur, Lior Wolf
- Explaining Documents' Relevance To Search Queries Razieh Rahimi, Youngwoo Kim, Hamed Zamani, James Allan
- Personalized Transformer For Explainable Recommendation Lei Li, Yongfeng Zhang, Li Chen
- An Explanation Of In-context Learning As Implicit Bayesian Inference Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma
- KAT: A Knowledge Augmented Transformer For Vision-and-language Liangke Gui et al.
- Automated Quality Assessment Of Cognitive Behavioral Therapy Sessions Through Highly Contextualized Language Representations Nikolaos Flemotomos et al.
- MMBERT: Multimodal BERT Pretraining For Improved Medical VQA Yash Khare et al.
- DYLE: Dynamic Latent Extraction For Abstractive Long-input Summarization Ziming Mao et al.
- On Hallucination And Predictive Uncertainty In Conditional Language Generation Yijun Xiao, William Yang Wang
- On Explaining Your Explanations Of BERT: An Empirical Study With Sequence Classification Zhengxuan Wu, Desmond C. Ong
- E-vil: A Dataset And Benchmark For Natural Language Explanations In Vision-language Tasks Maxime Kayser et al.
- Reframing Human-ai Collaboration For Generating Free-text Explanations Sarah Wiegreffe, Jack Hessel, Swabha Swayamdipta, Mark Riedl, Yejin Choi
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- Explanations From Large Language Models Make Small Reasoners Better Shiyang Li et al.
- On The Paradox Of Learning To Reason From Data Honghua Zhang, Liunian Harold Li, Tao Meng, Kai-wei Chang, Guy Van Den Broeck
- Cogvideo: Large-scale Pretraining For Text-to-video Generation Via Transformers Wenyi Hong, Ming Ding, Wendi Zheng, Xinghan Liu, Jie Tang
- A Survey Of Controllable Text Generation Using Transformer-based Pre-trained Language Models Hanqing Zhang, Haolin Song, Shaoyu Li, Ming Zhou, Dawei Song
- Rethinking With Retrieval: Faithful Large Language Model Inference Hangfeng He, Hongming Zhang, Dan Roth
- The Unreliability Of Explanations In Few-shot Prompting For Textual Reasoning Xi Ye, Greg Durrett
- Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey Sachin Kumar, Vidhisha Balachandran, Lucille Njoo, Antonios Anastasopoulos, Yulia Tsvetkov
- Automatic Generation Of Programming Exercises And Code Explanations Using Large Language Models Sami Sarsa, Paul Denny, Arto Hellas, Juho Leinonen
- React: Synergizing Reasoning And Acting In Language Models Shunyu Yao et al.
- Visconde: Multi-document QA With GPT-3 And Neural Reranking Jayr Pereira, Robson Fidalgo, Roberto Lotufo, Rodrigo Nogueira
- Maieutic Prompting: Logically Consistent Reasoning With Recursive Explanations Jaehun Jung et al.
- Personalized Prompt Learning For Explainable Recommendation Lei Li, Yongfeng Zhang, Li Chen
- NLX-GPT: A Model For Natural Language Explanations In Vision And Vision-language Tasks Fawaz Sammani, Tanmoy Mukherjee, Nikos Deligiannis
- Legal Prompting: Teaching A Language Model To Think Like A Lawyer Fangyi Yu, Lee Quartey, Frank Schilder
- Vl-interpret: An Interactive Visualization Tool For Interpreting Vision-language Transformers Estelle Aflalo et al.
- Binding Language Models In Symbolic Languages Zhoujun Cheng et al.
- Rationale-augmented Ensembles In Language Models Xuezhi Wang et al.
- Scaling Laws And Interpretability Of Learning From Repeated Data Danny Hernandez et al.
- Analogy Generation By Prompting Large Language Models: A Case Study Of Instructgpt Bhavya Bhavya, Jinjun Xiong, Chengxiang Zhai
- What Do They Capture? -- A Structural Analysis Of Pre-trained Language Models For Source Code Yao Wan et al.
- Retrieval Augmentation Of Large Language Models For Lay Language Generation Yue Guo, Wei Qiu, Gondy Leroy, Sheng Wang, Trevor Cohen
- Can Language Models Learn From Explanations In Context? Andrew K. Lampinen et al.
- Learn To Explain: Multimodal Reasoning Via Thought Chains For Science Question Answering Pan Lu et al.
- ROSCOE: A Suite Of Metrics For Scoring Step-by-step Reasoning Olga Golovneva et al.
- Rlprompt: Optimizing Discrete Text Prompts With Reinforcement Learning Mingkai Deng et al.
- M6-rec: Generative Pretrained Language Models Are Open-ended Recommender Systems Zeyu Cui, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang
- Driving With Llms: Fusing Object-level Vector Modality For Explainable Autonomous Driving Long Chen et al.
- Is Chatgpt A Good Recommender? A Preliminary Study Junling Liu et al.
- Bias And Fairness In Chatbots: An Overview Jintang Xue et al.
- Large Language Models (GPT) Struggle To Answer Multiple-choice Questions About Code Jaromir Savelka, Arav Agarwal, Christopher Bogart, Majd Sakr
- Paperqa: Retrieval-augmented Generative Agent For Scientific Research Jakub Lála et al.
- Thrilled By Your Progress! Large Language Models (GPT-4) No Longer Struggle To Pass Assessments In Higher Education Programming Courses Jaromir Savelka, Arav Agarwal, Marshall An, Chris Bogart, Majd Sakr
- More Robots Are Coming: Large Multimodal Models (chatgpt) Can Solve Visually Diverse Images Of Parsons Problems Irene Hou et al.
- Unifying Large Language Models And Knowledge Graphs: A Roadmap Shirui Pan et al.
- Scaling Vision-language Models With Sparse Mixture Of Experts Sheng Shen et al.
- Chatgpt Or Human? Detect And Explain. Explaining Decisions Of Machine Learning Model For Detecting Short Chatgpt-generated Text Sandra Mitrović, Davide Andreoletti, Omran Ayoub
- Are Emergent Abilities Of Large Language Models A Mirage? Rylan Schaeffer, Brando Miranda, Sanmi Koyejo
- Retrieving Multimodal Information For Augmented Generation: A Survey Ruochen Zhao et al.
- Faithful Chain-of-thought Reasoning Qing Lyu et al.
- Designerly Understanding: Information Needs For Model Transparency To Support Design Ideation For Ai-powered User Experience Q. Vera Liao, Hariharan Subramonyam, Jennifer Wang, Jennifer Wortman Vaughan
- AI Transparency In The Age Of Llms: A Human-centered Research Roadmap Q. Vera Liao, Jennifer Wortman Vaughan
- Harnessing Llms In Curricular Design: Using GPT-4 To Support Authoring Of Learning Objectives Pragnya Sridhar et al.
- Large Language Model Alignment: A Survey Tianhao Shen et al.
- Diagnostic Reasoning Prompts Reveal The Potential For Large Language Model Interpretability In Medicine Thomas Savage, Ashwin Nayak, Robert Gallo, Ekanath Rangan, Jonathan H Chen
- Orca: Progressive Learning From Complex Explanation Traces Of GPT-4 Subhabrata Mukherjee et al.
- Chatgpt Is Fun, But It Is Not Funny! Humor Is Still Challenging Large Language Models Sophie Jentzsch, Kristian Kersting
- Revisiting Relation Extraction In The Era Of Large Language Models Somin Wadhwa, Silvio Amir, Byron C. Wallace
- Better Patching Using LLM Prompting, Via Self-consistency Toufique Ahmed, Premkumar Devanbu
- Roco: Dialectic Multi-robot Collaboration With Large Language Models Zhao Mandi, Shreeya Jain, Shuran Song
- Chatgraph: Interpretable Text Classification By Converting Chatgpt Knowledge To Graphs Yucheng Shi et al.
- Capabilities Of GPT-4 On Medical Challenge Problems Harsha Nori, Nicholas King, Scott Mayer Mckinney, Dean Carignan, Eric Horvitz
- Languagempc: Large Language Models As Decision Makers For Autonomous Driving Hao Sha et al.
- Applying Large Language Models And Chain-of-thought For Automatic Scoring Gyeong-geon Lee, Ehsan Latif, Xuansheng Wu, Ninghao Liu, Xiaoming Zhai
- Gender Bias And Stereotypes In Large Language Models Hadas Kotek, Rikker Dockum, David Q. Sun
- Augmented Language Models: A Survey Grégoire Mialon et al.
- Explainability For Large Language Models: A Survey Haiyan Zhao et al.
- Lost In Translation: Large Language Models In Non-english Content Analysis Gabriel Nicholas, Aliya Bhatia
- Is Chatgpt Better Than Human Annotators? Potential And Limitations Of Chatgpt In Explaining Implicit Hate Speech Fan Huang, Haewoon Kwak, Jisun An
- Gptutor: A Chatgpt-powered Programming Tool For Code Explanation Eason Chen, Ray Huang, Han-shin Chen, Yuen-hsien Tseng, Liang-yi Li
- Vipergpt: Visual Inference Via Python Execution For Reasoning Dídac Surís, Sachit Menon, Carl Vondrick
- The Vector Grounding Problem Dimitri Coelho Mollo, Raphaël Millière
- Promptner: Prompting For Named Entity Recognition Dhananjay Ashok, Zachary C. Lipton
- Evaluating GPT-3.5 And GPT-4 Models On Brazilian University Admission Exams Desnes Nunes, Ricardo Primi, Ramon Pires, Roberto Lotufo, Rodrigo Nogueira
- A Short Survey Of Viewing Large Language Models In Legal Aspect Zhongxiang Sun
- Can Large Language Models Transform Computational Social Science? Caleb Ziems et al.
- Orca 2: Teaching Small Language Models How To Reason Arindam Mitra et al.
- Can Chatgpt Forecast Stock Price Movements? Return Predictability And Large Language Models Alejandro Lopez-lira, Yuehua Tang
- Trustworthy Llms: A Survey And Guideline For Evaluating Large Language Models' Alignment Yang Liu et al.
- Teaching Large Language Models To Self-debug Xinyun Chen, Maxwell Lin, Nathanael Schärli, Denny Zhou
- Rethinking The Evaluation For Conversational Recommendation In The Era Of Large Language Models Xiaolei Wang, Xinyu Tang, Wayne Xin Zhao, Jingyuan Wang, Ji-rong Wen
- Describe, Explain, Plan And Select: Interactive Planning With Large Language Models Enables Open-world Multi-task Agents Zihao Wang et al.
- Chat-rec: Towards Interactive And Explainable Llms-augmented Recommender System Yunfan Gao et al.
- Eyes Wide Shut? Exploring The Visual Shortcomings Of Multimodal Llms Shengbang Tong et al.
- Large Language Model Capabilities In Perioperative Risk Prediction And Prognostication Philip Chung et al.
- SNIFFER: Multimodal Large Language Model For Explainable Out-of-context Misinformation Detection Peng Qi, Zehong Yan, Wynne Hsu, Mong Li Lee
- A Review Of Large Language Models And Autonomous Agents In Chemistry Mayk Caldas Ramos, Christopher J. Collison, Andrew D. White
- What Large Language Models Know And What People Think They Know Mark Steyvers et al.
- Codeaid: Evaluating A Classroom Deployment Of An Llm-based Programming Assistant That Balances Student And Educator Needs Majeed Kazemitabaar et al.
- Rethinking Interpretability In The Era Of Large Language Models Chandan Singh, Jeevana Priya Inala, Michel Galley, Rich Caruana, Jianfeng Gao
🏷 INTERSPEECH
🏷 KDD
🏷 Language Modeling
- Sequence-to-sequence Learning As Beam-search Optimization Sam Wiseman, Alexander M. Rush
- An Actor-critic Algorithm For Sequence Prediction Dzmitry Bahdanau et al.
- Neural Text Generation From Structured Data With Application To The Biography Domain Remi Lebret, David Grangier, Michael Auli
- Long Text Generation Via Adversarial Training With Leaked Information Jiaxian Guo et al.
- Frustratingly Short Attention Spans In Neural Language Modeling Michał Daniluk, Tim Rocktäschel, Johannes Welbl, Sebastian Riedel
- DP-GAN: Diversity-promoting Generative Adversarial Network For Generating Informative And Diversified Text Jingjing Xu, Xuancheng Ren, Junyang Lin, Xu Sun
- Maskgan: Better Text Generation Via Filling In The______ William Fedus, Ian Goodfellow, Andrew M. Dai
- Seq2seq-vis: A Visual Debugging Tool For Sequence-to-sequence Models Hendrik Strobelt et al.
- Character-level Language Modeling With Deeper Self-attention Rami Al-rfou, Dokook Choe, Noah Constant, Mandy Guo, Llion Jones
- Learn To Code-switch: Data Augmentation Using Copy Mechanism On Language Modeling Genta Indra Winata, Andrea Madotto, Chien-sheng Wu, Pascale Fung
- Can You Tell Me How To Get Past Sesame Street? Sentence-level Pretraining Beyond Language Modeling Alex Wang et al.
- Sentence Encoders On Stilts: Supplementary Training On Intermediate Labeled-data Tasks Jason Phang, Thibault Févry, Samuel R. Bowman
- Toward Diverse Text Generation With Inverse Reinforcement Learning Zhan Shi, Xinchi Chen, Xipeng Qiu, Xuanjing Huang
- Evaluating Text Gans As Language Models Guy Tevet, Gavriel Habib, Vered Shwartz, Jonathan Berant
- BART: Denoising Sequence-to-sequence Pre-training For Natural Language Generation, Translation, And Comprehension Mike Lewis et al.
- Moverscore: Text Generation Evaluating With Contextualized Embeddings And Earth Mover Distance Wei Zhao et al.
- Transformer-xl: Attentive Language Models Beyond A Fixed-length Context Zihang Dai et al.
- Data-to-text Generation With Entity Modeling Ratish Puduppully, Li Dong, Mirella Lapata
- Counterfactual Story Reasoning And Generation Lianhui Qin et al.
- Reducing Transformer Depth On Demand With Structured Dropout Angela Fan, Edouard Grave, Armand Joulin
- ELI5: Long Form Question Answering Angela Fan et al.
- Improving Transformer Models By Reordering Their Sublayers Ofir Press, Noah A. Smith, Omer Levy
- Non-monotonic Sequential Text Generation Sean Welleck, Kianté Brantley, Hal Daumé III, Kyunghyun Cho
- LAMOL: Language Modeling For Lifelong Language Learning Fan-keng Sun, Cheng-hao Ho, Hung-yi Lee
- Neural Assistant: Joint Action Prediction, Response Generation, And Latent Knowledge Reasoning Arvind Neelakantan et al.
- Sticking To The Facts: Confident Decoding For Faithful Data-to-text Generation Ran Tian, Shashi Narayan, Thibault Sellam, Ankur P. Parikh
- Distilbert, A Distilled Version Of BERT: Smaller, Faster, Cheaper And Lighter Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
- Universal Adversarial Triggers For Attacking And Analyzing NLP Eric Wallace, Shi Feng, Nikhil Kandpal, Matt Gardner, Sameer Singh
- Non-autoregressive Transformer By Position Learning Yu Bao et al.
- LXMERT: Learning Cross-modality Encoder Representations From Transformers Hao Tan, Mohit Bansal
- Unified Language Model Pre-training For Natural Language Understanding And Generation Li Dong et al.
- A Tensorized Transformer For Language Modeling Xindian Ma et al.
- Generalization In Generation: A Closer Look At Exposure Bias Florian Schmidt
- Pay Less Attention With Lightweight And Dynamic Convolutions Felix Wu, Angela Fan, Alexei Baevski, Yann N. Dauphin, Michael Auli
- Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection Guangxiang Zhao et al.
- Single Headed Attention RNN: Stop Thinking With Your Head Stephen Merity
- Language Modeling With Deep Transformers Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney
- Sentence-level Content Planning And Style Specification For Neural Text Generation Xinyu Hua, Lu Wang
- Modeling Graph Structure In Transformer For Better Amr-to-text Generation Jie Zhu et al.
- Commongen: A Constrained Text Generation Challenge For Generative Commonsense Reasoning Bill Yuchen Lin et al.
- A Generalized Framework Of Sequence Generation With Application To Undirected Sequence Models Elman Mansimov, Alex Wang, Sean Welleck, Kyunghyun Cho
- The Bottom-up Evolution Of Representations In The Transformer: A Study With Machine Translation And Language Modeling Objectives Elena Voita, Rico Sennrich, Ivan Titov
- Caire: An Empathetic Neural Chatbot Zhaojiang Lin et al.
- Levenshtein Transformer Jiatao Gu, Changhan Wang, Jake Zhao
- CTRL: A Conditional Transformer Language Model For Controllable Generation Nitish Shirish Keskar, Bryan Mccann, Lav R. Varshney, Caiming Xiong, Richard Socher
- Distilling Knowledge Learned In BERT For Text Generation Yen-chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu
- Paraphrasing With Large Language Models Sam Witteveen, Martin Andrews
- Patent Claim Generation By Fine-tuning Openai GPT-2 Jieh-sheng Lee, Jieh Hsiang
- Stabilizing Transformers For Reinforcement Learning Emilio Parisotto et al.
- Bp-transformer: Modelling Long-range Context Via Binary Partitioning Zihao Ye, Qipeng Guo, Quan Gan, Xipeng Qiu, Zheng Zhang
- Tree Transformer: Integrating Tree Structures Into Self-attention Yau-shian Wang, Hung-yi Lee, Yun-nung Chen
- Structured Pruning Of Large Language Models Ziheng Wang, Jeremy Wohlwend, Tao Lei
- Bertscore: Evaluating Text Generation With BERT Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q. Weinberger, Yoav Artzi
- Evaluating Commonsense In Pre-trained Language Models Xuhui Zhou, Yue Zhang, Leyang Cui, Dandan Huang
- Augmenting Self-attention With Persistent Memory Sainbayar Sukhbaatar, Edouard Grave, Guillaume Lample, Herve Jegou, Armand Joulin
- Adaptive Attention Span In Transformers Sainbayar Sukhbaatar, Edouard Grave, Piotr Bojanowski, Armand Joulin
- Barack's Wife Hillary: Using Knowledge-graphs For Fact-aware Language Modeling Robert L. Iv Logan, Nelson F. Liu, Matthew E. Peters, Matt Gardner, Sameer Singh
- Unicoder-vl: A Universal Encoder For Vision And Language By Cross-modal Pre-training Gen Li et al.
- Plug And Play Language Models: A Simple Approach To Controlled Text Generation Sumanth Dathathri et al.
- Hello, It's GPT-2 -- How Can I Help You? Towards The Use Of Pretrained Language Models For Task-oriented Dialogue Systems Paweł Budzianowski, Ivan Vulić
- What Makes A Good Conversation? How Controllable Attributes Affect Human Judgments Abigail See, Stephen Roller, Douwe Kiela, Jason Weston
- GLTR: Statistical Detection And Visualization Of Generated Text Sebastian Gehrmann, Hendrik Strobelt, Alexander M. Rush
- Insertion Transformer: Flexible Sequence Generation Via Insertion Operations Mitchell Stern, William Chan, Jamie Kiros, Jakob Uszkoreit
- Megatron-lm: Training Multi-billion Parameter Language Models Using Model Parallelism Mohammad Shoeybi et al.
- Text Infilling Wanrong Zhu, Zhiting Hu, Eric Xing
- Pretrained Encyclopedia: Weakly Supervised Knowledge-pretrained Language Model Wenhan Xiong, Jingfei Du, William Yang Wang, Veselin Stoyanov
- Few-shot NLG With Pre-trained Language Model Zhiyu Chen, Harini Eavani, Wenhu Chen, Yinyin Liu, William Yang Wang
- Fairseq: A Fast, Extensible Toolkit For Sequence Modeling Myle Ott et al.
- Encoder-agnostic Adaptation For Conditional Language Generation Zachary M. Ziegler, Luke Melas-kyriazi, Sebastian Gehrmann, Alexander M. Rush
- Controlled Hallucinations: Learning To Generate Faithfully From Noisy Data Katja Filippova
- Knowledge-driven Data Construction For Zero-shot Evaluation In Commonsense Question Answering Kaixin Ma et al.
- REALM: Retrieval-augmented Language Model Pre-training Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-wei Chang
- Unifiedqa: Crossing Format Boundaries With A Single QA System Daniel Khashabi et al.
- Layoutlmv2: Multi-modal Pre-training For Visually-rich Document Understanding Yang Xu et al.
- CG-BERT: Conditional Text Generation With BERT For Generalized Few-shot Intent Detection Congying Xia, Chenwei Zhang, Hoang Nguyen, Jiawei Zhang, Philip Yu
- Lite Transformer With Long-short Range Attention Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han
- Optimus: Organizing Sentences Via Pre-trained Modeling Of A Latent Space Chunyuan Li et al.
- Knowledge-aware Language Model Pretraining Corby Rosset et al.
- Meaningful Answer Generation Of E-commerce Question-answering Shen Gao, Xiuying Chen, Zhaochun Ren, Dongyan Zhao, Rui Yan
- Enabling Language Models To Fill In The Blanks Chris Donahue, Mina Lee, Percy Liang
- TOD-BERT: Pre-trained Natural Language Understanding For Task-oriented Dialogue Chien-sheng Wu, Steven Hoi, Richard Socher, Caiming Xiong
- Gpt-too: A Language-model-first Approach For Amr-to-text Generation Manuel Mager et al.
- KG-BART: Knowledge Graph-augmented BART For Generative Commonsense Reasoning Ye Liu, Yao Wan, Lifang He, Hao Peng, Philip S. Yu
- Explaining Question Answering Models Through Text Generation Veronica Latcinnik, Jonathan Berant
- Few-shot Text Generation With Pattern-exploiting Training Timo Schick, Hinrich Schütze
- Byte Pair Encoding Is Suboptimal For Language Model Pretraining Kaj Bostrom, Greg Durrett
- XGPT: Cross-modal Generative Pre-training For Image Captioning Qiaolin Xia et al.
- Mathematical Reasoning Via Self-supervised Skip-tree Training Markus N. Rabe, Dennis Lee, Kshitij Bansal, Christian Szegedy
- A Simple Language Model For Task-oriented Dialogue Ehsan Hosseini-asl, Bryan Mccann, Chien-sheng Wu, Semih Yavuz, Richard Socher
- Robust Conversational AI With Grounded Text Generation Jianfeng Gao et al.
- Synthesizer: Rethinking Self-attention In Transformer Models Yi Tay et al.
- Exploring And Predicting Transferability Across NLP Tasks Tu Vu et al.
- Genaug: Data Augmentation For Finetuning Text Generators Steven Y. Feng, Varun Gangal, Dongyeop Kang, Teruko Mitamura, Eduard Hovy
- MEGATRON-CNTRL: Controllable Story Generation With External Knowledge Using Large-scale Language Models Peng Xu et al.
- From Zero To Hero: On The Limitations Of Zero-shot Cross-lingual Transfer With Multilingual Transformers Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš
- A Comparison Of LSTM And BERT For Small Corpus Aysu Ezen-can
- GMAT: Global Memory Augmentation For Transformers Ankit Gupta, Jonathan Berant
- Addressing Some Limitations Of Transformers With Feedback Memory Angela Fan, Thibaut Lavril, Edouard Grave, Armand Joulin, Sainbayar Sukhbaatar
- Incorporating BERT Into Parallel Sequence Decoding With Adapters Junliang Guo et al.
- Ernie-doc: A Retrospective Long-document Modeling Transformer Siyu Ding et al.
- Contrastive Learning With Adversarial Perturbations For Conditional Text Generation Seanie Lee, Dong Bok Lee, Sung Ju Hwang
- Cocon: A Self-supervised Approach For Controlled Text Generation Alvin Chan, Yew-soon Ong, Bill Pung, Aston Zhang, Jie Fu
- BLEURT: Learning Robust Metrics For Text Generation Thibault Sellam, Dipanjan Das, Ankur P. Parikh
- An Empirical Investigation Of Pre-trained Transformer Language Models For Open-domain Dialogue Generation Piji Li
- POINTER: Constrained Progressive Text Generation Via Insertion-based Generative Pre-training Yizhe Zhang et al.
- Realtoxicityprompts: Evaluating Neural Toxic Degeneration In Language Models Samuel Gehman, Suchin Gururangan, Maarten Sap, Yejin Choi, Noah A. Smith
- A Survey Of Knowledge-enhanced Text Generation Wenhao Yu et al.
- Text-to-text Pre-training For Data-to-text Tasks Mihir Kale, Abhinav Rastogi
- Text Generation By Learning From Demonstrations Richard Yuanzhe Pang, He He
- The Language Interpretability Tool: Extensible, Interactive Visualizations And Analysis For NLP Models Ian Tenney et al.
- Pre-training Via Paraphrasing Mike Lewis et al.
- Template Guided Text Generation For Task-oriented Dialogue Mihir Kale, Abhinav Rastogi
- Accelerating Training Of Transformer-based Language Models With Progressive Layer Dropping Minjia Zhang, Yuxiong He
- KGPT: Knowledge-grounded Pre-training For Data-to-text Generation Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang
- Residual Energy-based Models For Text Generation Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'Aurelio Ranzato
- Aragpt2: Pre-trained Transformer For Arabic Language Generation Wissam Antoun, Fady Baly, Hazem Hajj
- Unilmv2: Pseudo-masked Language Models For Unified Language Model Pre-training Hangbo Bao et al.
- Multilingual Speech Translation With Efficient Finetuning Of Pretrained Models Xian Li et al.
- Language Generation With Multi-hop Reasoning On Commonsense Knowledge Graph Haozhe Ji et al.
- Generation-augmented Retrieval For Open-domain Question Answering Yuning Mao et al.
- TAP: Text-aware Pre-training For Text-vqa And Text-caption Zhengyuan Yang et al.
- Delight: Deep And Light-weight Transformer Sachin Mehta, Marjan Ghazvininejad, Srinivasan Iyer, Luke Zettlemoyer, Hannaneh Hajishirzi
- ELECTRA: Pre-training Text Encoders As Discriminators Rather Than Generators Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning
- The Pile: An 800GB Dataset Of Diverse Text For Language Modeling Leo Gao et al.
- Investigating Pretrained Language Models For Graph-to-text Generation Leonardo F. R. Ribeiro, Martin Schmitt, Hinrich Schütze, Iryna Gurevych
- Longformer: The Long-document Transformer Iz Beltagy, Matthew E. Peters, Arman Cohan
- Bias Out-of-the-box: An Empirical Analysis Of Intersectional Occupational Biases In Popular Generative Language Models Hannah Kirk et al.
- Ernie-vilg: Unified Generative Pre-training For Bidirectional Vision-language Generation Han Zhang et al.
- Codified Audio Language Modeling Learns Useful Representations For Music Information Retrieval Rodrigo Castellon, Chris Donahue, Percy Liang
- Prefix-tuning: Optimizing Continuous Prompts For Generation Xiang Lisa Li, Percy Liang
- Thinking Aloud: Dynamic Context Generation Improves Zero-shot Reasoning Performance Of GPT-2 Gregor Betz, Kyle Richardson, Christian Voigt
- MAUVE: Measuring The Gap Between Neural Text And Human Text Using Divergence Frontiers Krishna Pillutla et al.
- Unifying Multimodal Transformer For Bi-directional Image And Text Generation Yupan Huang, Hongwei Xue, Bei Liu, Yutong Lu
- Bartscore: Evaluating Generated Text As Text Generation Weizhe Yuan, Graham Neubig, Pengfei Liu
- Text-free Prosody-aware Generative Spoken Language Modeling Eugene Kharitonov et al.
- Structural Adapters In Pretrained Language Models For Amr-to-text Generation Leonardo F. R. Ribeiro, Yue Zhang, Iryna Gurevych
- Focused Attention Improves Document-grounded Generation Shrimai Prabhumoye, Kazuma Hashimoto, Yingbo Zhou, Alan W Black, Ruslan Salakhutdinov
- ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Shuohuan Wang et al.
- Parallel Refinements For Lexically Constrained Text Generation With BART Xingwei He
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- Efficient Large Scale Language Modeling With Mixtures Of Experts Mikel Artetxe et al.
- Controllable Generation From Pre-trained Language Models Via Inverse Prompting Xu Zou et al.
- Clip4caption: CLIP For Video Caption Mingkang Tang et al.
- UC2: Universal Cross-lingual Cross-modal Vision-and-language Pre-training Mingyang Zhou et al.
- Primer: Searching For Efficient Transformers For Language Modeling David R. So et al.
- Adaptive Semiparametric Language Models Dani Yogatama, Cyprien de Masson d'Autume, Lingpeng Kong
- A Plug-and-play Method For Controlled Text Generation Damian Pascual, Beni Egressy, Clara Meister, Ryan Cotterell, Roger Wattenhofer
- Zero-shot Recommendation As Language Modeling Damien Sileo, Wout Vossen, Robbe Raymaekers
- MAGMA -- Multimodal Augmentation Of Generative Models Through Adapter-based Finetuning Constantin Eichenberg, Sidney Black, Samuel Weinbach, Letitia Parcalabescu, Anette Frank
- Dialogue State Tracking With A Language Model Using Schema-driven Prompting Chia-hsuan Lee, Hao Cheng, Mari Ostendorf
- A Token-level Reference-free Hallucination Detection Benchmark For Free-form Text Generation Tianyu Liu et al.
- Larger-scale Transformers For Multilingual Masked Language Modeling Naman Goyal, Jingfei Du, Myle Ott, Giri Anantharaman, Alexis Conneau
- Simvlm: Simple Visual Language Model Pretraining With Weak Supervision Zirui Wang et al.
- Recent Advances In Natural Language Processing Via Large Pre-trained Language Models: A Survey Bonan Min et al.
- Vision Guided Generative Pre-trained Language Models For Multimodal Abstractive Summarization Tiezheng Yu, Wenliang Dai, Zihan Liu, Pascale Fung
- MMBERT: Multimodal BERT Pretraining For Improved Medical VQA Yash Khare et al.
- Openprompt: An Open-source Framework For Prompt-learning Ning Ding et al.
- Prompt-learning For Fine-grained Entity Typing Ning Ding et al.
- CDLM: Cross-document Language Modeling Avi Caciularu et al.
- Hindsight: Posterior-guided Training Of Retrievers For Improved Open-ended Generation Ashwin Paranjape, Omar Khattab, Christopher Potts, Matei Zaharia, Christopher D. Manning
- Debertav3: Improving Deberta Using Electra-style Pre-training With Gradient-disentangled Embedding Sharing Pengcheng He, Jianfeng Gao, Weizhu Chen
- Sustainable Modular Debiasing Of Language Models Anne Lauscher, Tobias Lüken, Goran Glavaš
- GLM: General Language Model Pretraining With Autoregressive Blank Infilling Zhengxiao Du et al.
- When Attention Meets Fast Recurrence: Training Language Models With Reduced Compute Tao Lei
- On Hallucination And Predictive Uncertainty In Conditional Language Generation Yijun Xiao, William Yang Wang
- Demix Layers: Disentangling Domains For Modular Language Modeling Suchin Gururangan, Mike Lewis, Ari Holtzman, Noah A. Smith, Luke Zettlemoyer
- Dexperts: Decoding-time Controlled Text Generation With Experts And Anti-experts Alisa Liu et al.
- TURINGBENCH: A Benchmark Environment For Turing Test In The Age Of Neural Text Generation Adaku Uchendu, Zeyu Ma, Thai Le, Rui Zhang, Dongwon Lee
- Scifive: A Text-to-text Transformer Model For Biomedical Literature Long N. Phan et al.
- Exploring Transformers In Natural Language Generation: GPT, BERT, And Xlnet M. Onat Topal, Anil Bas, Imke Van Heerden
- E-vil: A Dataset And Benchmark For Natural Language Explanations In Vision-language Tasks Maxime Kayser et al.
- Tip-adapter: Training-free Clip-adapter For Better Vision-language Modeling Renrui Zhang et al.
- UFO: A Unified Transformer For Vision-language Representation Learning Jianfeng Wang et al.
- Long Text Generation By Modeling Sentence-level And Discourse-level Coherence Jian Guan et al.
- Unitab: Unifying Text And Box Outputs For Grounded Vision-language Modeling Zhengyuan Yang et al.
- Unifying Vision-and-language Tasks Via Text Generation Jaemin Cho, Jie Lei, Hao Tan, Mohit Bansal
- How Much Do Language Models Copy From Their Training Data? Evaluating Linguistic Novelty In Text Generation Using RAVEN R. Thomas McCoy, Paul Smolensky, Tal Linzen, Jianfeng Gao, Asli Celikyilmaz
- Pretrained Language Models For Text Generation: A Survey Junyi Li, Tianyi Tang, Wayne Xin Zhao, Ji-Rong Wen
- Few-shot Knowledge Graph-to-text Generation With Pretrained Language Models Junyi Li et al.
- Normformer: Improved Transformer Pretraining With Extra Normalization Sam Shleifer, Jason Weston, Myle Ott
- Retromae: Pre-training Retrieval-oriented Language Models Via Masked Auto-encoder Shitao Xiao, Zheng Liu, Yingxia Shao, Zhao Cao
- Revisiting Neural Scaling Laws In Language And Vision Ibrahim Alabdulmohsin, Behnam Neyshabur, Xiaohua Zhai
- Reacc: A Retrieval-augmented Code Completion Framework Shuai Lu et al.
- A Survey On Retrieval-augmented Text Generation Huayang Li, Yixuan Su, Deng Cai, Yan Wang, Lemao Liu
- On The Effect Of Pretraining Corpora On In-context Learning By A Large-scale Language Model Seongjin Shin et al.
- One Embedder, Any Task: Instruction-finetuned Text Embeddings Hongjin Su et al.
- A Length-extrapolatable Transformer Yutao Sun et al.
- A Survey Of Controllable Text Generation Using Transformer-based Pre-trained Language Models Hanqing Zhang, Haolin Song, Shaoyu Li, Ming Zhou, Dawei Song
- Vl-beit: Generative Vision-language Pretraining Hangbo Bao, Wenhui Wang, Li Dong, Furu Wei
- Layoutlmv3: Pre-training For Document AI With Unified Text And Image Masking Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei
- Diffusion-lm Improves Controllable Text Generation Xiang Lisa Li, John Thickstun, Ishaan Gulrajani, Percy Liang, Tatsunori B. Hashimoto
- Contrastive Decoding: Open-ended Text Generation As Optimization Xiang Lisa Li et al.
- Alexatm 20B: Few-shot Learning Using A Large-scale Multilingual Seq2seq Model Saleh Soltan et al.
- Cramming: Training A Language Model On A Single GPU In One Day Jonas Geiping, Tom Goldstein
- Biogpt: Generative Pre-trained Transformer For Biomedical Text Generation And Mining Renqian Luo et al.
- GIT: A Generative Image-to-text Transformer For Vision And Language Jianfeng Wang et al.
- Diffuseq: Sequence To Sequence Text Generation With Diffusion Models Shansan Gong, Mukai Li, Jiangtao Feng, Zhiyong Wu, Lingpeng Kong
- Scaling Autoregressive Models For Content-rich Text-to-image Generation Jiahui Yu et al.
- Zerogen: Efficient Zero-shot Learning Via Dataset Generation Jiacheng Ye et al.
- SPACE-3: Unified Dialog Model Pre-training For Task-oriented Dialog Understanding And Generation Wanwei He et al.
- Gpt-neox-20b: An Open-source Autoregressive Language Model Sid Black et al.
- RARR: Researching And Revising What Language Models Say, Using Language Models Luyu Gao et al.
- What Language Model Architecture And Pretraining Objective Work Best For Zero-shot Generalization? Thomas Wang et al.
- Vit5: Pretrained Text-to-text Transformer For Vietnamese Language Generation Long Phan, Hieu Tran, Hieu Nguyen, Trieu H. Trinh
- Memorization Without Overfitting: Analyzing The Training Dynamics Of Large Language Models Kushal Tirumala, Aram H. Markosyan, Luke Zettlemoyer, Armen Aghajanyan
- Is Reinforcement Learning (not) For Natural Language Processing: Benchmarks, Baselines, And Building Blocks For Natural Language Policy Optimization Rajkumar Ramamurthy et al.
- Self-conditioned Embedding Diffusion For Text Generation Robin Strudel et al.
- A Systematic Evaluation Of Large Language Models Of Code Frank F. Xu, Uri Alon, Graham Neubig, Vincent J. Hellendoorn
- Pangu-coder: Program Synthesis With Function-level Language Modeling Fenia Christopoulou et al.
- Matcha: Enhancing Visual Language Pretraining With Math Reasoning And Chart Derendering Fangyu Liu et al.
- Compilable Neural Code Generation With Compiler Feedback Xin Wang et al.
- Quark: Controllable Text Generation With Reinforced Unlearning Ximing Lu et al.
- Lm-nav: Robotic Navigation With Large Pre-trained Models Of Language, Vision, And Action Dhruv Shah, Blazej Osinski, Brian Ichter, Sergey Levine
- Neural Pipeline For Zero-shot Data-to-text Generation Zdeněk Kasner, Ondřej Dušek
- Adaprompt: Adaptive Model Training For Prompt-based NLP Yulong Chen et al.
- Hungry Hungry Hippos: Towards Language Modeling With State Space Models Daniel Y. Fu et al.
- Fast Inference From Transformers Via Speculative Decoding Yaniv Leviathan, Matan Kalman, Yossi Matias
- Cont: Contrastive Neural Text Generation Chenxin An et al.
- Calibrating Sequence Likelihood Improves Conditional Language Generation Yao Zhao et al.
- Language Models Are General-purpose Interfaces Yaru Hao et al.
- Survey Of Hallucination In Natural Language Generation Ziwei Ji et al.
- Audiolm: A Language Modeling Approach To Audio Generation Zalán Borsos et al.
- Language Models Can See: Plugging Visual Controls In Text Generation Yixuan Su et al.
- Memorizing Transformers Yuhuai Wu, Markus N. Rabe, Delesley Hutchins, Christian Szegedy
- Recurrent Memory Transformer Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev
- Zero-shot Video Question Answering Via Frozen Bidirectional Language Models Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- Generating Training Data With Language Models: Towards Zero-shot Language Understanding Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han
- Contrastive Search Is What You Need For Neural Text Generation Yixuan Su, Nigel Collier
- Text And Patterns: For Effective Chain Of Thought, It Takes Two To Tango Aman Madaan, Amir Yazdanbakhsh
- Should You Mask 15% In Masked Language Modeling? Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen
- Prompt Tuning For Discriminative Pre-trained Language Models Yuan Yao et al.
- Palm: Scaling Language Modeling With Pathways Aakanksha Chowdhery et al.
- PEVL: Position-enhanced Pre-training And Prompt Tuning For Vision-language Models Yuan Yao et al.
- ROSCOE: A Suite Of Metrics For Scoring Step-by-step Reasoning Olga Golovneva et al.
- Generative Spoken Dialogue Language Modeling Tu Anh Nguyen et al.
- Towards A Unified Multi-dimensional Evaluator For Text Generation Ming Zhong et al.
- Deep Bidirectional Language-knowledge Graph Pretraining Michihiro Yasunaga et al.
- Retrieval-augmented Multimodal Language Modeling Michihiro Yasunaga et al.
- Transformer Quality In Linear Time Weizhe Hua, Zihang Dai, Hanxiao Liu, Quoc V. Le
- Confident Adaptive Language Modeling Tal Schuster et al.
- Co-writing Screenplays And Theatre Scripts With Language Models: An Evaluation By Industry Professionals Piotr Mirowski, Kory W. Mathewson, Jaylen Pittman, Richard Evans
- OFA: Unifying Architectures, Tasks, And Modalities Through A Simple Sequence-to-sequence Learning Framework Peng Wang et al.
- LLM Self Defense: By Self Examination, Llms Know They Are Being Tricked Mansi Phute et al.
- Give Us The Facts: Enhancing Large Language Models With Knowledge Graphs For Fact-aware Language Modeling Linyao Yang, Hongyang Chen, Zhao Li, Xiao Ding, Xindong Wu
- Scaling Autoregressive Multi-modal Models: Pretraining And Instruction Tuning Lili Yu et al.
- BLIP-2: Bootstrapping Language-image Pre-training With Frozen Image Encoders And Large Language Models Junnan Li, Dongxu Li, Silvio Savarese, Steven Hoi
- Transferable Decoding With Visual Entities For Zero-shot Image Captioning Junjie Fei et al.
- Increasing Diversity While Maintaining Accuracy: Text Data Generation With Large Language Models And Human Interventions John Joon Young Chung, Ece Kamar, Saleema Amershi
- Large Language Models Cannot Self-correct Reasoning Yet Jie Huang et al.
- A Systematic Survey Of Prompt Engineering On Vision-language Foundation Models Jindong Gu et al.
- AWQ: Activation-aware Weight Quantization For LLM Compression And Acceleration Ji Lin et al.
- Memory-efficient Fine-tuning Of Compressed Large Language Models Via Sub-4-bit Integer Quantization Jeonghoon Kim et al.
- Chatgpt To Replace Crowdsourcing Of Paraphrases For Intent Classification: Higher Diversity And Comparable Model Robustness Jan Cegin, Jakub Simko, Peter Brusilovsky
- Muse: Text-to-image Generation Via Masked Generative Transformers Huiwen Chang et al.
- Cognitive Mirage: A Review Of Hallucinations In Large Language Models Hongbin Ye, Tong Liu, Aijia Zhang, Wei Hua, Weiqiang Jia
- Extending Context Window Of Large Language Models Via Positional Interpolation Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian
- Unifying Large Language Models And Knowledge Graphs: A Roadmap Shirui Pan et al.
- Chain-of-verification Reduces Hallucination In Large Language Models Shehzaad Dhuliawala et al.
- Factscore: Fine-grained Atomic Evaluation Of Factual Precision In Long Form Text Generation Sewon Min et al.
- Medalign: A Clinician-generated Dataset For Instruction Following With Electronic Medical Records Scott L. Fleming et al.
- How Useful Are Educational Questions Generated By Large Language Models? Sabina Elkins, Ekaterina Kochmar, Jackie C. K. Cheung, Iulian Serban
- A Universal Question-answering Platform For Knowledge Graphs Reham Omar, Ishika Dhall, Panos Kalnis, Essam Mansour
- Chatgpt As A Factual Inconsistency Evaluator For Text Summarization Zheheng Luo, Qianqian Xie, Sophia Ananiadou
- Masked Vision And Language Pre-training With Unimodal And Multimodal Contrastive Losses For Medical Visual Question Answering Pengfei Li, Gang Liu, Jinlong He, Zixu Zhao, Shenjun Zhong
- VISAR: A Human-ai Argumentative Writing Assistant With Visual Programming And Rapid Draft Prototyping Zheng Zhang, Jie Gao, Ranjodh Singh Dhaliwal, Toby Jia-Jun Li
- In-context Retrieval-augmented Language Models Ori Ram et al.
- Datatales: Investigating The Use Of Large Language Models For Authoring Data-driven Articles Nicole Sultanum, Arjun Srinivasan
- Using Large Language Models To Generate Junit Tests: An Empirical Study Mohammed Latif Siddiq et al.
- Toolformer: Language Models Can Teach Themselves To Use Tools Timo Schick et al.
- Expressive Text-to-image Generation With Rich Text Songwei Ge, Taesung Park, Jun-Yan Zhu, Jia-Bin Huang
- Do Generative Large Language Models Need Billions Of Parameters? Sia Gholami, Marwan Omar
- Opportunities And Challenges For Chatgpt And Large Language Models In Biomedicine And Health Shubo Tian et al.
- Flashattention-2: Faster Attention With Better Parallelism And Work Partitioning Tri Dao
- REPLUG: Retrieval-augmented Black-box Language Models Weijia Shi et al.
- A Survey Of Large Language Models Wayne Xin Zhao et al.
- Video-llama: An Instruction-tuned Audio-visual Language Model For Video Understanding Hang Zhang, Xin Li, Lidong Bing
- Augmented Language Models: A Survey Grégoire Mialon et al.
- Codegen2: Lessons For Training Llms On Programming And Natural Languages Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou
- Macaw-llm: Multi-modal Language Modeling With Image, Audio, Video, And Text Integration Chenyang Lyu et al.
- Reinforced Self-training (rest) For Language Modeling Caglar Gulcehre et al.
- Motiongpt: Human Motion As A Foreign Language Biao Jiang et al.
- Scaling Transformer To 1M Tokens And Beyond With RMT Aydar Bulatov, Yuri Kuratov, Yermek Kapushev, Mikhail S. Burtsev
- On The Application Of Large Language Models For Language Teaching And Assessment Technology Andrew Caines et al.
- Opening Up Chatgpt: Tracking Openness, Transparency, And Accountability In Instruction-tuned Text Generators Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse
- Lamp: When Large Language Models Meet Personalization Alireza Salemi, Sheshera Mysore, Michael Bendersky, Hamed Zamani
- Clipsyntel: CLIP And LLM Synergy For Multimodal Question Summarization In Healthcare Akash Ghosh et al.
- Adaptive Machine Translation With Large Language Models Yasmin Moslem, Rejwanul Haque, John D. Kelleher, Andy Way
- Collaborative Large Language Model For Recommender Systems Yaochen Zhu, Liang Wu, Qi Guo, Liangjie Hong, Jundong Li
- Graph Neural Prompting With Large Language Models Yijun Tian et al.
- A Survey On Large Language Model (LLM) Security And Privacy: The Good, The Bad, And The Ugly Yifan Yao et al.
- Low-rank Adaptation Of Large Language Model Rescoring For Parameter-efficient Speech Recognition Yu Yu et al.
- Speak Foreign Languages With Your Own Voice: Cross-lingual Neural Codec Language Modeling Ziqiang Zhang et al.
- Retentive Network: A Successor To Transformer For Large Language Models Yutao Sun et al.
- Minillm: Knowledge Distillation Of Large Language Models Yuxian Gu, Li Dong, Furu Wei, Minlie Huang
- Fine-grained Human Feedback Gives Better Rewards For Language Model Training Zeqiu Wu et al.
- Transformers Are Ssms: Generalized Models And Efficient Algorithms Through Structured State Space Duality Tri Dao, Albert Gu
- Shaping Human-ai Collaboration: Varied Scaffolding Levels In Co-writing With Language Models Paramveer S. Dhillon et al.
- AI Hallucinations: A Misnomer Worth Clarifying Negar Maleki, Balaji Padmanabhan, Kaushik Dutta
- Findings Of The Second Babylm Challenge: Sample-efficient Pretraining On Developmentally Plausible Corpora Michael Y. Hu et al.
- Xlstm: Extended Long Short-term Memory Maximilian Beck et al.
- Large Language Model (LLM) AI Text Generation Detection Based On Transformer Deep Learning Algorithm Yuhong Mo, Hao Qin, Yushan Dong, Ziyi Zhu, Zhenglin Li
- Llamafactory: Unified Efficient Fine-tuning Of 100+ Language Models Yaowei Zheng et al.
🏷 Large-Scale Training
- Nemo: A Toolkit For Building AI Applications Using Neural Modules Oleksii Kuchaiev et al.
- Fairseq: A Fast, Extensible Toolkit For Sequence Modeling Myle Ott et al.
- Scaling Laws For Neural Language Models Jared Kaplan et al.
- Just Ask: Learning To Answer Questions From Millions Of Narrated Videos Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- Addressing Some Limitations Of Transformers With Feedback Memory Angela Fan, Thibaut Lavril, Edouard Grave, Armand Joulin, Sainbayar Sukhbaatar
- FILIP: Fine-grained Interactive Language-image Pre-training Lewei Yao et al.
- Exploring Transformers In Natural Language Generation: GPT, BERT, And Xlnet M. Onat Topal, Anil Bas, Imke Van Heerden
- Fastmoe: A Fast Mixture-of-expert Training System Jiaao He et al.
- Revisiting Neural Scaling Laws In Language And Vision Ibrahim Alabdulmohsin, Behnam Neyshabur, Xiaohua Zhai
- Cramming: Training A Language Model On A Single GPU In One Day Jonas Geiping, Tom Goldstein
- Using Deepspeed And Megatron To Train Megatron-turing NLG 530B, A Large-scale Generative Language Model Shaden Smith et al.
- Reproducible Scaling Laws For Contrastive Language-image Learning Mehdi Cherti et al.
- Scaling Laws And Interpretability Of Learning From Repeated Data Danny Hernandez et al.
- Large Language Models Are Zero-shot Reasoners Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa
- Scaling Laws For Language Encoding Models In Fmri Richard Antonello, Aditya Vaidya, Alexander G. Huth
- Emergent And Predictable Memorization In Large Language Models Stella Biderman et al.
- Codegen2: Lessons For Training Llms On Programming And Natural Languages Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou
- RWKV: Reinventing Rnns For The Transformer Era Bo Peng et al.
- AI And Memory Wall Amir Gholami et al.
🏷 LREC
🏷 Masked Language Model
- Masked Language Model Scoring Julian Salazar, Davis Liang, Toan Q. Nguyen, Katrin Kirchhoff
- Unicoder: A Universal Language Encoder By Pre-training With Multiple Cross-lingual Tasks Haoyang Huang et al.
- LXMERT: Learning Cross-modality Encoder Representations From Transformers Hao Tan, Mohit Bansal
- Unsupervised Cross-lingual Representation Learning At Scale Alexis Conneau et al.
- The Bottom-up Evolution Of Representations In The Transformer: A Study With Machine Translation And Language Modeling Objectives Elena Voita, Rico Sennrich, Ivan Titov
- Distilling Knowledge Learned In BERT For Text Generation Yen-chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu
- Unicoder-vl: A Universal Encoder For Vision And Language By Cross-modal Pre-training Gen Li et al.
- Span Selection Pre-training For Question Answering Michael Glass et al.
- Colake: Contextualized Language And Knowledge Embedding Tianxiang Sun et al.
- To Pretrain Or Not To Pretrain: Examining The Benefits Of Pretraining On Resource Rich Tasks Sinong Wang, Madian Khabsa, Hao Ma
- REALM: Retrieval-augmented Language Model Pre-training Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-Wei Chang
- Masking As An Efficient Alternative To Finetuning For Pretrained Language Models Mengjie Zhao, Tao Lin, Fei Mi, Martin Jaggi, Hinrich Schütze
- TOD-BERT: Pre-trained Natural Language Understanding For Task-oriented Dialogue Chien-Sheng Wu, Steven Hoi, Richard Socher, Caiming Xiong
- Contextualized Perturbation For Textual Adversarial Attack Dianqi Li et al.
- Byte Pair Encoding Is Suboptimal For Language Model Pretraining Kaj Bostrom, Greg Durrett
- XGPT: Cross-modal Generative Pre-training For Image Captioning Qiaolin Xia et al.
- GMAT: Global Memory Augmentation For Transformers Ankit Gupta, Jonathan Berant
- Autoprompt: Eliciting Knowledge From Language Models With Automatically Generated Prompts Taylor Shin, Yasaman Razeghi, Robert L. Logan IV, Eric Wallace, Sameer Singh
- An Empirical Investigation Of Pre-trained Transformer Language Models For Open-domain Dialogue Generation Piji Li
- Pre-training Via Paraphrasing Mike Lewis et al.
- On Learning Universal Representations Across Languages Xiangpeng Wei et al.
- What Does BERT Know About Books, Movies And Music? Probing BERT For Conversational Recommendation Gustavo Penha, Claudia Hauff
- Unilmv2: Pseudo-masked Language Models For Unified Language Model Pre-training Hangbo Bao et al.
- X-LXMERT: Paint, Caption And Answer Questions With Multi-modal Transformers Jaemin Cho, Jiasen Lu, Dustin Schwenk, Hannaneh Hajishirzi, Aniruddha Kembhavi
- TAP: Text-aware Pre-training For Text-vqa And Text-caption Zhengyuan Yang et al.
- ELECTRA: Pre-training Text Encoders As Discriminators Rather Than Generators Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning
- BERT, Mbert, Or Bibert? A Study On Contextualized Embeddings For Neural Machine Translation Haoran Xu, Benjamin Van Durme, Kenton Murray
- Vision-and-language Or Vision-for-language? On Cross-modal Influence In Multimodal Transformers Stella Frank, Emanuele Bugliarello, Desmond Elliott
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- Urltran: Improving Phishing URL Detection Using Transformers Pranav Maneriker et al.
- Larger-scale Transformers For Multilingual Masked Language Modeling Naman Goyal, Jingfei Du, Myle Ott, Giri Anantharaman, Alexis Conneau
- Climatebert: A Pretrained Language Model For Climate-related Text Nicolas Webersinke, Mathias Kraus, Julia Anna Bingler, Markus Leippold
- MMBERT: Multimodal BERT Pretraining For Improved Medical VQA Yash Khare et al.
- Prompt-learning For Fine-grained Entity Typing Ning Ding et al.
- NSP-BERT: A Prompt-based Few-shot Learner Through An Original Pre-training Task--next Sentence Prediction Yi Sun, Yu Zheng, Chao Hao, Hangping Qiu
- CDLM: Cross-document Language Modeling Avi Caciularu et al.
- Debertav3: Improving Deberta Using Electra-style Pre-training With Gradient-disentangled Embedding Sharing Pengcheng He, Jianfeng Gao, Weizhu Chen
- Tacl: Improving BERT Pre-training With Token-aware Contrastive Learning Yixuan Su et al.
- UFO: A Unified Transformer For Vision-language Representation Learning Jianfeng Wang et al.
- Normformer: Improved Transformer Pretraining With Extra Normalization Sam Shleifer, Jason Weston, Myle Ott
- Retromae: Pre-training Retrieval-oriented Language Models Via Masked Auto-encoder Shitao Xiao, Zheng Liu, Yingxia Shao, Zhao Cao
- Vl-beit: Generative Vision-language Pretraining Hangbo Bao, Wenhui Wang, Li Dong, Furu Wei
- Layoutlmv3: Pre-training For Document AI With Unified Text And Image Masking Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei
- Cramming: Training A Language Model On A Single GPU In One Day Jonas Geiping, Tom Goldstein
- What Language Model Architecture And Pretraining Objective Work Best For Zero-shot Generalization? Thomas Wang et al.
- Memorization Without Overfitting: Analyzing The Training Dynamics Of Large Language Models Kushal Tirumala, Aram H. Markosyan, Luke Zettlemoyer, Armen Aghajanyan
- Pangu-coder: Program Synthesis With Function-level Language Modeling Fenia Christopoulou et al.
- LERT: A Linguistically-motivated Pre-trained Language Model Yiming Cui, Wanxiang Che, Shijin Wang, Ting Liu
- Audiolm: A Language Modeling Approach To Audio Generation Zalán Borsos et al.
- Zero-shot Video Question Answering Via Frozen Bidirectional Language Models Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- Should You Mask 15% In Masked Language Modeling? Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen
- Deep Bidirectional Language-knowledge Graph Pretraining Michihiro Yasunaga et al.
- CLIPPO: Image-and-language Understanding From Pixels Only Michael Tschannen, Basil Mustafa, Neil Houlsby
- Transformer Quality In Linear Time Weizhe Hua, Zihang Dai, Hanxiao Liu, Quoc V. Le
- Masked Vision And Language Pre-training With Unimodal And Multimodal Contrastive Losses For Medical Visual Question Answering Pengfei Li, Gang Liu, Jinlong He, Zixu Zhao, Shenjun Zhong
- Findings Of The Second Babylm Challenge: Sample-efficient Pretraining On Developmentally Plausible Corpora Michael Y. Hu et al.
🏷 Merging
- Attention Is All You Need Ashish Vaswani et al.
- Phase Conductor On Multi-layered Attentions For Machine Comprehension Rui Liu, Wei Wei, Weiguang Mao, Maria Chikina
- Simple Fusion: Return Of The Language Model Felix Stahlberg, James Cross, Veselin Stoyanov
- Polite Dialogue Generation Without Parallel Data Tong Niu, Mohit Bansal
- Multimodal Attention Networks For Low-level Vision-and-language Navigation Federico Landi, Lorenzo Baraldi, Marcella Cornia, Massimiliano Corsini, Rita Cucchiara
- Acquiring Knowledge From Pre-trained Model To Neural Machine Translation Rongxiang Weng, Heng Yu, Shujian Huang, Shanbo Cheng, Weihua Luo
- Encode, Tag, Realize: High-precision Text Editing Eric Malmi, Sebastian Krause, Sascha Rothe, Daniil Mirylenka, Aliaksei Severyn
- Language Modeling With Deep Transformers Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney
- Leveraging Pre-trained Checkpoints For Sequence Generation Tasks Sascha Rothe, Shashi Narayan, Aliaksei Severyn
- Knowledge Aware Conversation Generation With Explainable Reasoning Over Augmented Graphs Zhibin Liu, Zheng-Yu Niu, Hua Wu, Haifeng Wang
- Evaluating Commonsense In Pre-trained Language Models Xuhui Zhou, Yue Zhang, Leyang Cui, Dandan Huang
- Jointly Optimizing Diversity And Relevance In Neural Response Generation Xiang Gao et al.
- Fusion Of Detected Objects In Text For Visual Question Answering Chris Alberti, Jeffrey Ling, Michael Collins, David Reitter
- Text Infilling Wanrong Zhu, Zhiting Hu, Eric Xing
- Colake: Contextualized Language And Knowledge Embedding Tianxiang Sun et al.
- VD-BERT: A Unified Vision And Dialog Transformer With BERT Yue Wang et al.
- Adapterdrop: On The Efficiency Of Adapters In Transformers Andreas Rücklé et al.
- Machine Reading Comprehension: The Role Of Contextualized Language Models And Beyond Zhuosheng Zhang, Hai Zhao, Rui Wang
- Dialoguetrm: Exploring The Intra- And Inter-modal Emotional Behaviors In The Conversation Yuzhao Mao et al.
- Probing Pretrained Language Models For Lexical Semantics Ivan Vulić, Edoardo Maria Ponti, Robert Litschko, Goran Glavaš, Anna Korhonen
- Multi-modal Open-domain Dialogue Kurt Shuster, Eric Michael Smith, Da Ju, Jason Weston
- Vlmo: Unified Vision-language Pre-training With Mixture-of-modality-experts Hangbo Bao et al.
- Crossing The Conversational Chasm: A Primer On Natural Language Processing For Multilingual Task-oriented Dialogue Systems Evgeniia Razumovskaia et al.
- Revisiting The Primacy Of English In Zero-shot Cross-lingual Transfer Iulia Turc, Kenton Lee, Jacob Eisenstein, Ming-wei Chang, Kristina Toutanova
- Text Compression-aided Transformer Encoding Zuchao Li et al.
- The Impact Of Multiple Parallel Phrase Suggestions On Email Input And Composition Behaviour Of Native And Non-native English Writers Daniel Buschek, Martin Zürn, Malin Eiband
- Non-invasive Self-attention For Side Information Fusion In Sequential Recommendation Chang Liu et al.
- Vision Guided Generative Pre-trained Language Models For Multimodal Abstractive Summarization Tiezheng Yu, Wenliang Dai, Zihan Liu, Pascale Fung
- An Empirical Study Of Training End-to-end Vision-and-language Transformers Zi-yi Dou et al.
- FLAVA: A Foundational Language And Vision Alignment Model Amanpreet Singh et al.
- Scalable And Efficient Moe Training For Multitask Multilingual Models Young Jin Kim et al.
- UFO: A Unified Transformer For Vision-language Representation Learning Jianfeng Wang et al.
- FLAT: An Optimized Dataflow For Mitigating Attention Bottlenecks Sheng-chun Kao, Suvinay Subramanian, Gaurav Agrawal, Amir Yazdanbakhsh, Tushar Krishna
- Reasoning With Language Model Prompting: A Survey Shuofei Qiao et al.
- A Survey Of Controllable Text Generation Using Transformer-based Pre-trained Language Models Hanqing Zhang, Haolin Song, Shaoyu Li, Ming Zhou, Dawei Song
- Diffusiondb: A Large-scale Prompt Gallery Dataset For Text-to-image Generative Models Zijie J. Wang et al.
- Hybrid Transformer With Multi-level Fusion For Multimodal Knowledge Graph Completion Xiang Chen et al.
- Diffusion-lm Improves Controllable Text Generation Xiang Lisa Li, John Thickstun, Ishaan Gulrajani, Percy Liang, Tatsunori B. Hashimoto
- Ignore Previous Prompt: Attack Techniques For Language Models Fábio Perez, Ian Ribeiro
- Healthprompt: A Zero-shot Learning Paradigm For Clinical Natural Language Processing Sonish Sivarajkumar, Yanshan Wang
- Diffuseq: Sequence To Sequence Text Generation With Diffusion Models Shansan Gong, Mukai Li, Jiangtao Feng, Zhiyong Wu, Lingpeng Kong
- Benchmarking Large Language Models For Automated Verilog RTL Code Generation Shailja Thakur et al.
- Dall-eval: Probing The Reasoning Skills And Social Biases Of Text-to-image Generation Models Jaemin Cho, Abhay Zala, Mohit Bansal
- Efficient Long-text Understanding With Short-text Models Maor Ivgi, Uri Shaham, Jonathan Berant
- Self-conditioned Embedding Diffusion For Text Generation Robin Strudel et al.
- Vindlu: A Recipe For Effective Video-and-language Pretraining Feng Cheng et al.
- SKILL: Structured Knowledge Infusion For Large Language Models Fedor Moiseev, Zhe Dong, Enrique Alfonseca, Martin Jaggi
- GPTQ: Accurate Post-training Quantization For Generative Pre-trained Transformers Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh
- Rationale-augmented Ensembles In Language Models Xuezhi Wang et al.
- Language And Culture Internalisation For Human-like Autotelic AI Cédric Colas, Tristan Karch, Clément Moulin-frier, Pierre-yves Oudeyer
- LAION-5B: An Open Large-scale Dataset For Training Next Generation Image-text Models Christoph Schuhmann et al.
- Mplug: Effective And Efficient Vision-language Learning By Cross-modal Skip-connections Chenliang Li et al.
- Optimizing Prompts For Text-to-image Generation Yaru Hao, Zewen Chi, Li Dong, Furu Wei
- Active Example Selection For In-context Learning Yiming Zhang, Shi Feng, Chenhao Tan
- Thinking Fast And Slow In Large Language Models Thilo Hagendorff, Sarah Fabi, Michal Kosinski
- Deep Bidirectional Language-knowledge Graph Pretraining Michihiro Yasunaga et al.
- Re2g: Retrieve, Rerank, Generate Michael Glass et al.
- Promptsource: An Integrated Development Environment And Repository For Natural Language Prompts Stephen H. Bach et al.
- Conversational Question Answering On Heterogeneous Sources Philipp Christmann, Rishiraj Saha Roy, Gerhard Weikum
- From Image To Language: A Critical Analysis Of Visual Question Answering (VQA) Approaches, Challenges, And Opportunities Md Farhan Ishmam, Md Sakib Hossain Shovon, M. F. Mridha, Nilanjan Dey
- Llm-grounded Diffusion: Enhancing Prompt Understanding Of Text-to-image Diffusion Models With Large Language Models Long Lian, Boyi Li, Adam Yala, Trevor Darrell
- Generative Artificial Intelligence In Learning Analytics: Contextualising Opportunities And Challenges Through The Learning Analytics Cycle Lixiang Yan, Roberto Martinez-maldonado, Dragan Gašević
- Automatically Correcting Large Language Models: Surveying The Landscape Of Diverse Self-correction Strategies Liangming Pan et al.
- Layoutllm-t2i: Eliciting Layout Guidance From LLM For Text-to-image Generation Leigang Qu, Shengqiong Wu, Hao Fei, Liqiang Nie, Tat-seng Chua
- Surgicalgpt: End-to-end Language-vision GPT For Visual Question Answering In Surgery Lalithkumar Seenivasan, Mobarakol Islam, Gokul Kannan, Hongliang Ren
- Automating Customer Service Using Langchain: Building Custom Open-source GPT Chatbot For Organizations Keivalya Pandya, Mehfuza Holia
- Not What You've Signed Up For: Compromising Real-world Llm-integrated Applications With Indirect Prompt Injection Kai Greshake et al.
- BLIP-2: Bootstrapping Language-image Pre-training With Frozen Image Encoders And Large Language Models Junnan Li, Dongxu Li, Silvio Savarese, Steven Hoi
- Ai-augmented Surveys: Leveraging Large Language Models And Surveys For Opinion Prediction Junsol Kim, Byungkyu Lee
- Exploring The Benefits Of Training Expert Language Models Over Instruction Tuning Joel Jang et al.
- Missrec: Pre-training And Transferring Multi-modal Interest-aware Sequence Representation For Recommendation Jinpeng Wang et al.
- A Systematic Survey Of Prompt Engineering On Vision-language Foundation Models Jindong Gu et al.
- Unlearn What You Want To Forget: Efficient Unlearning For Llms Jiaao Chen, Diyi Yang
- AWQ: Activation-aware Weight Quantization For LLM Compression And Acceleration Ji Lin et al.
- Large Language Models (GPT) Struggle To Answer Multiple-choice Questions About Code Jaromir Savelka, Arav Agarwal, Christopher Bogart, Majd Sakr
- The Curse Of Recursion: Training On Generated Data Makes Models Forget Ilia Shumailov et al.
- Muse: Text-to-image Generation Via Masked Generative Transformers Huiwen Chang et al.
- A Comprehensive Overview Of Large Language Models Humza Naveed et al.
- Ip-adapter: Text Compatible Image Prompt Adapter For Text-to-image Diffusion Models Hu Ye, Jun Zhang, Sibo Liu, Xiao Han, Wei Yang
- Llm-in-the-loop: Leveraging Large Language Model For Thematic Analysis Shih-chieh Dai, Aiping Xiong, Lun-wei Ku
- Sur-adapter: Enhancing Text-to-image Pre-trained Diffusion Models With Large Language Models Shanshan Zhong, Zhongzhan Huang, Wushao Wen, Jinghui Qin, Liang Lin
- Next-gpt: Any-to-any Multimodal LLM Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, Tat-seng Chua
- Chatgpt Is Not All You Need. A State Of The Art Review Of Large Generative AI Models Roberto Gozalo-brizuela, Eduardo C. Garrido-merchan
- Beyond Memorization: Violating Privacy Via Inference With Large Language Models Robin Staab, Mark Vero, Mislav Balunović, Martin Vechev
- Chatgpt Versus Traditional Question Answering For Knowledge Graphs: Current Status And Future Directions Towards Knowledge Graph Chatbots Reham Omar, Omij Mangukiya, Panos Kalnis, Essam Mansour
- Grounded Text-to-image Synthesis With Attention Refocusing Quynh Phung, Songwei Ge, Jia-bin Huang
- AI Transparency In The Age Of Llms: A Human-centered Research Roadmap Q. Vera Liao, Jennifer Wortman Vaughan
- Harnessing Llms In Curricular Design: Using GPT-4 To Support Authoring Of Learning Objectives Pragnya Sridhar et al.
- Regulating Chatgpt And Other Large Generative AI Models Philipp Hacker, Andreas Engel, Marco Mauer
- Llama-adapter V2: Parameter-efficient Visual Instruction Model Peng Gao et al.
- Caption Anything: Interactive Image Description With Diverse Multimodal Controls Teng Wang et al.
- Transformative Effects Of Chatgpt On Modern Education: Emerging Era Of AI Chatbots Sukhpal Singh Gill et al.
- Expressive Text-to-image Generation With Rich Text Songwei Ge, Taesung Park, Jun-yan Zhu, Jia-bin Huang
- Mind Meets Machine: Unravelling Gpt-4's Cognitive Psychology Sifatkaur Dhingra, Manmeet Singh, Vaisakh Sb, Neetiraj Malviya, Sukhpal Singh Gill
- Do Llms Understand User Preferences? Evaluating Llms On User Rating Prediction Wang-cheng Kang et al.
- The Troubling Emergence Of Hallucination In Large Language Models -- An Extensive Definition, Quantification, And Prescriptive Remediations Vipula Rawte et al.
- Creativity Support In The Age Of Large Language Models: An Empirical Study Involving Emerging Writers Tuhin Chakrabarty, Vishakh Padmakumar, Faeze Brahman, Smaranda Muresan
- Personallm: Investigating The Ability Of Large Language Models To Express Personality Traits Hang Jiang et al.
- Explainability For Large Language Models: A Survey Haiyan Zhao et al.
- Survey Of Vulnerabilities In Large Language Models Revealed By Adversarial Attacks Erfan Shayegani et al.
- Llm-blender: Ensembling Large Language Models With Pairwise Ranking And Generative Fusion Dongfu Jiang, Xiang Ren, Bill Yuchen Lin
- Minigpt-4: Enhancing Vision-language Understanding With Advanced Large Language Models Deyao Zhu, Jun Chen, Xiaoqian Shen, Xiang Li, Mohamed Elhoseiny
- Large Language Models For Generative Information Extraction: A Survey Derong Xu et al.
- 3d-vista: Pre-trained Transformer For 3D Vision And Text Alignment Ziyu Zhu et al.
- Visual Chatgpt: Talking, Drawing And Editing With Visual Foundation Models Chenfei Wu et al.
- Receive, Reason, And React: Drive As You Say With Large Language Models In Autonomous Vehicles Can Cui, Yunsheng Ma, Xu Cao, Wenqian Ye, Ziran Wang
- The False Promise Of Imitating Proprietary Llms Arnav Gudibande et al.
- Expel: LLM Agents Are Experiential Learners Andrew Zhao et al.
- Generative AI: Implications And Applications For Education Anastasia Olnancy Olga et al.
- Robots That Ask For Help: Uncertainty Alignment For Large Language Model Planners Allen Z. Ren et al.
- Can Chatgpt Forecast Stock Price Movements? Return Predictability And Large Language Models Alejandro Lopez-lira, Yuehua Tang
- Beyond Chain-of-thought, Effective Graph-of-thought Reasoning In Language Models Yao Yao, Zuchao Li, Hai Zhao
- Recommender Systems In The Era Of Large Language Models (llms) Zihuai Zhao et al.
- Flexgen: High-throughput Generative Inference Of Large Language Models With A Single GPU Ying Sheng et al.
- Mindmap: Knowledge Graph Prompting Sparks Graph Of Thoughts In Large Language Models Yilin Wen, Zifeng Wang, Jimeng Sun
- Key-locked Rank One Editing For Text-to-image Personalization Yoad Tewel, Rinon Gal, Gal Chechik, Yuval Atzmon
- The Dark Side Of Chatgpt: Legal And Ethical Challenges From Stochastic Parrots And Hallucination Zihao Li
- Delving Into Multimodal Prompting For Fine-grained Visual Classification Xin Jiang et al.
- Visual Adversarial Examples Jailbreak Aligned Large Language Models Xiangyu Qi et al.
- Hard Prompts Made Easy: Gradient-based Discrete Optimization For Prompt Tuning And Discovery Yuxin Wen et al.
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- Iris: An Ai-driven Virtual Tutor For Computer Science Education Patrick Bassner, Eduard Frankford, Stephan Krusche
- Fine-tuned Language Models Generate Stable Inorganic Materials As Text Nate Gruver et al.
- A Survey Of Resource-efficient LLM And Multimodal Foundation Models Mengwei Xu et al.
- A Review Of Large Language Models And Autonomous Agents In Chemistry Mayk Caldas Ramos, Christopher J. Collison, Andrew D. White
- Pixart-σ: Weak-to-strong Training Of Diffusion Transformer For 4K Text-to-image Generation Junsong Chen et al.
- Revolutionizing Finance With Llms: An Overview Of Applications And Insights Huaqin Zhao et al.
- Recent Advances In Generative AI And Large Language Models: Current Status, Challenges, And Perspectives Desta Haileselassie Hagos, Rick Battle, Danda B. Rawat
- Rethinking Interpretability In The Era Of Large Language Models Chandan Singh, Jeevana Priya Inala, Michel Galley, Rich Caruana, Jianfeng Gao
- Large Language Model (LLM) AI Text Generation Detection Based On Transformer Deep Learning Algorithm Yuhong Mo, Hao Qin, Yushan Dong, Ziyi Zhu, Zhenglin Li
- Large Language Models In Mental Health Care: A Scoping Review Yining Hua et al.
- Understanding Llms: A Comprehensive Overview From Training To Inference Yiheng Liu et al.
- Biomistral: A Collection Of Open-source Pretrained Large Language Models For Medical Domains Yanis Labrak et al.
🏷 Model Architecture
- Programming With A Differentiable Forth Interpreter Matko Bošnjak, Tim Rocktäschel, Jason Naradowsky, Sebastian Riedel
- Sequence-to-sequence Learning As Beam-search Optimization Sam Wiseman, Alexander M. Rush
- Topic Aware Neural Response Generation Chen Xing et al.
- Separating Answers From Queries For Neural Reading Comprehension Dirk Weissenborn
- Generative Deep Neural Networks For Dialogue: A Short Review Iulian Vlad Serban, Ryan Lowe, Laurent Charlin, Joelle Pineau
- Learning Python Code Suggestion With A Sparse Pointer Network Avishkar Bhoopchand, Tim Rocktäschel, Earl Barr, Sebastian Riedel
- Attention Strategies For Multi-source Sequence-to-sequence Learning Jindřich Libovický, Jindřich Helcl
- Attention Is All You Need Ashish Vaswani et al.
- Weighted Transformer Network For Machine Translation Karim Ahmed, Nitish Shirish Keskar, Richard Socher
- Phase Conductor On Multi-layered Attentions For Machine Comprehension Rui Liu, Wei Wei, Weiguang Mao, Maria Chikina
- End-to-end Optimization Of Goal-driven And Visually Grounded Dialogue Systems Florian Strub et al.
- Batch Policy Gradient Methods For Improving Neural Conversation Models Kirthevasan Kandasamy, Yoram Bachrach, Ryota Tomioka, Daniel Tarlow, David Carter
- Gated-attention Architectures For Task-oriented Language Grounding Devendra Singh Chaplot, Kanthashree Mysore Sathyendra, Rama Kumar Pasumarthi, Dheeraj Rajagopal, Ruslan Salakhutdinov
- Frustratingly Short Attention Spans In Neural Language Modeling Michał Daniluk, Tim Rocktäschel, Johannes Welbl, Sebastian Riedel
- A Unified Query-based Generative Model For Question Generation And Question Answering Linfeng Song, Zhiguo Wang, Wael Hamza
- Parlai: A Dialog Research Software Platform Alexander H. Miller et al.
- A Deep Reinforcement Learning Chatbot Iulian V. Serban et al.
- The Memad Submission To The WMT18 Multimodal Translation Task Stig-arne Grönroos et al.
- Multilingual Constituency Parsing With Self-attention And Pre-training Nikita Kitaev, Steven Cao, Dan Klein
- Character-level Language Modeling With Deeper Self-attention Rami Al-rfou, Dokook Choe, Noah Constant, Mandy Guo, Llion Jones
- Extending Neural Generative Conversational Model Using External Knowledge Sources Prasanna Parthasarathi, Joelle Pineau
- Adversarially Regularising Neural NLI Models To Integrate Logical Background Knowledge Pasquale Minervini, Sebastian Riedel
- Learn To Code-switch: Data Augmentation Using Copy Mechanism On Language Modeling Genta Indra Winata, Andrea Madotto, Chien-sheng Wu, Pascale Fung
- Commonsense For Generative Multi-hop Question Answering Tasks Lisa Bauer, Yicheng Wang, Mohit Bansal
- Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context Urvashi Khandelwal, He He, Peng Qi, Dan Jurafsky
- Can You Tell Me How To Get Past Sesame Street? Sentence-level Pretraining Beyond Language Modeling Alex Wang et al.
- Wizard Of Wikipedia: Knowledge-powered Conversational Agents Emily Dinan et al.
- Sdnet: Contextualized Attention-based Deep Network For Conversational Question Answering Chenguang Zhu, Michael Zeng, Xuedong Huang
- Sentence Encoders On Stilts: Supplementary Training On Intermediate Labeled-data Tasks Jason Phang, Thibault Févry, Samuel R. Bowman
- An Affect-rich Neural Conversational Model With Biased Attention And Weighted Cross-entropy Loss Peixiang Zhong, Di Wang, Chunyan Miao
- A Dataset For Document Grounded Conversations Kangyan Zhou, Shrimai Prabhumoye, Alan W Black
- Improving The Transformer Translation Model With Document-level Context Jiacheng Zhang et al.
- Disentangling Language And Knowledge In Task-oriented Dialogs Dinesh Raghu, Nikhil Gupta, Mausam
- Training Tips For The Transformer Model Martin Popel, Ondřej Bojar
- Tensor2tensor For Neural Machine Translation Ashish Vaswani et al.
- Attention-guided Answer Distillation For Machine Reading Comprehension Minghao Hu et al.
- Topic-based Evaluation For Conversational Bots Fenfei Guo et al.
- "Bilingual Expert" Can Find Translation Errors Kai Fan et al.
- Simple Fusion: Return Of The Language Model Felix Stahlberg, James Cross, Veselin Stoyanov
- Multi-cast Attention Networks For Retrieval-based Question Answering And Response Prediction Yi Tay, Luu Anh Tuan, Siu Cheung Hui
- BERT: Pre-training Of Deep Bidirectional Transformers For Language Understanding Jacob Devlin, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Polite Dialogue Generation Without Parallel Data Tong Niu, Mohit Bansal
- Is Multilingual BERT Fluent In Language Generation? Samuel Rönnqvist, Jenna Kanerva, Tapio Salakoski, Filip Ginter
- Multi-passage BERT: A Globally Normalized BERT Model For Open-domain Question Answering Zhiguo Wang, Patrick Ng, Xiaofei Ma, Ramesh Nallapati, Bing Xiang
- Unified Vision-language Pre-training For Image Captioning And VQA Luowei Zhou et al.
- Sample Efficient Text Summarization Using A Single Pre-trained Transformer Urvashi Khandelwal, Kevin Clark, Dan Jurafsky, Lukasz Kaiser
- Recosa: Detecting The Relevant Contexts With Self-attention For Multi-turn Dialogue Generation Hainan Zhang, Yanyan Lan, Liang Pang, Jiafeng Guo, Xueqi Cheng
- Efficient Adaptation Of Pretrained Transformers For Abstractive Summarization Andrew Hoang, Antoine Bosselut, Asli Celikyilmaz, Yejin Choi
- MKD: A Multi-task Knowledge Distillation Approach For Pretrained Language Models Linqing Liu, Huan Wang, Jimmy Lin, Richard Socher, Caiming Xiong
- BART: Denoising Sequence-to-sequence Pre-training For Natural Language Generation, Translation, And Comprehension Mike Lewis et al.
- Multimodal Attention Networks For Low-level Vision-and-language Navigation Federico Landi, Lorenzo Baraldi, Marcella Cornia, Massimiliano Corsini, Rita Cucchiara
- Scalable Attentive Sentence-pair Modeling Via Distilled Sentence Embedding Oren Barkan et al.
- Reqa: An Evaluation For End-to-end Answer Retrieval Models Amin Ahmad, Noah Constant, Yinfei Yang, Daniel Cer
- Structbert: Incorporating Language Structures Into Pre-training For Deep Language Understanding Wei Wang et al.
- Probing Natural Language Inference Models Through Semantic Fragments Kyle Richardson, Hai Hu, Lawrence S. Moss, Ashish Sabharwal
- Bert4rec: Sequential Recommendation With Bidirectional Encoder Representations From Transformer Fei Sun et al.
- Entity-consistent End-to-end Task-oriented Dialogue System With KB Retriever Libo Qin et al.
- Attention Is Not Explanation Sarthak Jain, Byron C. Wallace
- Acquiring Knowledge From Pre-trained Model To Neural Machine Translation Rongxiang Weng, Heng Yu, Shujian Huang, Shanbo Cheng, Weihua Luo
- Transformer-xl: Attentive Language Models Beyond A Fixed-length Context Zihang Dai et al.
- Revealing The Dark Secrets Of BERT Olga Kovaleva, Alexey Romanov, Anna Rogers, Anna Rumshisky
- Align, Mask And Select: A Simple Method For Incorporating Commonsense Knowledge Into Language Representation Models Zhi-xiu Ye, Qian Chen, Wen Wang, Zhen-hua Ling
- Olmpics -- On What Language Model Pre-training Captures Alon Talmor, Yanai Elazar, Yoav Goldberg, Jonathan Berant
- Multiqa: An Empirical Investigation Of Generalization And Transfer In Reading Comprehension Alon Talmor, Jonathan Berant
- Masked Language Model Scoring Julian Salazar, Davis Liang, Toan Q. Nguyen, Katrin Kirchhoff
- Fully Quantized Transformer For Machine Translation Gabriele Prato, Ella Charlaix, Mehdi Rezagholizadeh
- Understanding The Behaviors Of BERT In Ranking Yifan Qiao, Chenyan Xiong, Zhenghao Liu, Zhiyuan Liu
- Data-to-text Generation With Entity Modeling Ratish Puduppully, Li Dong, Mirella Lapata
- Generating Persona Consistent Dialogues By Exploiting Natural Language Inference Haoyu Song, Wei-nan Zhang, Jingwen Hu, Ting Liu
- Convert: Efficient And Accurate Conversational Representations From Transformers Matthew Henderson et al.
- Reducing Transformer Depth On Demand With Structured Dropout Angela Fan, Edouard Grave, Armand Joulin
- Contextualized Sparse Representations For Real-time Open-domain Question Answering Jinhyuk Lee, Minjoon Seo, Hannaneh Hajishirzi, Jaewoo Kang
- Answering Complex Open-domain Questions Through Iterative Query Generation Peng Qi, Xiaowen Lin, Leo Mehr, Zijian Wang, Christopher D. Manning
- Q8BERT: Quantized 8bit BERT Ofir Zafrir, Guy Boudoukh, Peter Izsak, Moshe Wasserblat
- Improving Transformer Models By Reordering Their Sublayers Ofir Press, Noah A. Smith, Omer Levy
- Pretrained Language Models For Sequential Sentence Classification Arman Cohan, Iz Beltagy, Daniel King, Bhavana Dalvi, Daniel S. Weld
- Pretrained Language Models For Document-level Neural Machine Translation Liangyou Li, Xin Jiang, Qun Liu
- PEGASUS: Pre-training With Extracted Gap-sentences For Abstractive Summarization Jingqing Zhang, Yao Zhao, Mohammad Saleh, Peter J. Liu
- Language Models As Knowledge Bases? Fabio Petroni et al.
- BERT For Joint Intent Classification And Slot Filling Qian Chen, Zhu Zhuo, Wen Wang
- Neural Assistant: Joint Action Prediction, Response Generation, And Latent Knowledge Reasoning Arvind Neelakantan et al.
- Exploiting Persona Information For Diverse Generation Of Conversational Responses Haoyu Song, Wei-nan Zhang, Yiming Cui, Dong Wang, Ting Liu
- PLATO: Pre-trained Dialogue Generation Model With Discrete Latent Variable Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang
- Transformers Without Tears: Improving The Normalization Of Self-attention Toan Q. Nguyen, Julian Salazar
- Distilbert, A Distilled Version Of BERT: Smaller, Faster, Cheaper And Lighter Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
- Camembert: A Tasty French Language Model Louis Martin et al.
- Universal Adversarial Triggers For Attacking And Analyzing NLP Eric Wallace, Shi Feng, Nikhil Kandpal, Matt Gardner, Sameer Singh
- Approximating Interactive Human Evaluation With Self-play For Open-domain Dialog Systems Asma Ghandeharioun et al.
- Non-autoregressive Transformer By Position Learning Yu Bao et al.
- Dialogue Transformers Vladimir Vlasov, Johannes E. M. Mosig, Alan Nichol
- Encode, Tag, Realize: High-precision Text Editing Eric Malmi, Sebastian Krause, Sascha Rothe, Daniil Mirylenka, Aliaksei Severyn
- Compressive Transformers For Long-range Sequence Modelling Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, Timothy P. Lillicrap
- Unicoder: A Universal Language Encoder By Pre-training With Multiple Cross-lingual Tasks Haoyang Huang et al.
- LXMERT: Learning Cross-modality Encoder Representations From Transformers Hao Tan, Mohit Bansal
- Well-read Students Learn Better: On The Importance Of Pre-training Compact Models Iulia Turc, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Unified Language Model Pre-training For Natural Language Understanding And Generation Li Dong et al.
- UER: An Open-source Toolkit For Pre-training Models Zhe Zhao et al.
- A Tensorized Transformer For Language Modeling Xindian Ma et al.
- Multi-step Retriever-reader Interaction For Scalable Open-domain Question Answering Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Andrew Mccallum
- Pythia: Ai-assisted Code Completion System Alexey Svyatkovskiy, Ying Zhao, Shengyu Fu, Neel Sundaresan
- Unsupervised Cross-lingual Representation Learning At Scale Alexis Conneau et al.
- Pay Less Attention With Lightweight And Dynamic Convolutions Felix Wu, Angela Fan, Alexei Baevski, Yann N. Dauphin, Michael Auli
- Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection Guangxiang Zhao et al.
- A Pre-training Based Personalized Dialogue Generation Model With Persona-sparse Data Yinhe Zheng, Rongsheng Zhang, Xiaoxi Mao, Minlie Huang
- Reweighted Proximal Pruning For Large-scale Language Representation Fu-ming Guo, Sijia Liu, Finlay S. Mungall, Xue Lin, Yanzhi Wang
- BERT Has A Mouth, And It Must Speak: BERT As A Markov Random Field Language Model Alex Wang, Kyunghyun Cho
- Cloze-driven Pretraining Of Self-attention Networks Alexei Baevski, Sergey Edunov, Yinhan Liu, Luke Zettlemoyer, Michael Auli
- Are Sixteen Heads Really Better Than One? Paul Michel, Omer Levy, Graham Neubig
- Linking Artificial And Human Neural Representations Of Language Jon Gauthier, Roger Levy
- MUSE: Parallel Multi-scale Attention For Sequence To Sequence Learning Guangxiang Zhao, Xu Sun, Jingjing Xu, Zhiyuan Zhang, Liangchen Luo
- Structured Pruning Of A Bert-based Question Answering Model J. S. Mccarley, Rishav Chakravarti, Avirup Sil
- Tinybert: Distilling BERT For Natural Language Understanding Xiaoqi Jiao et al.
- Single Headed Attention RNN: Stop Thinking With Your Head Stephen Merity
- Repurposing Entailment For Multi-hop Question Answering Tasks Harsh Trivedi, Heeyoung Kwon, Tushar Khot, Ashish Sabharwal, Niranjan Balasubramanian
- Language Modeling With Deep Transformers Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney
- Frustratingly Easy Natural Question Answering Lin Pan et al.
- Zero: Memory Optimizations Toward Training Trillion Parameter Models Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, Yuxiong He
- Review Conversational Reading Comprehension Hu Xu, Bing Liu, Lei Shu, Philip S. Yu
- Modeling Graph Structure In Transformer For Better Amr-to-text Generation Jie Zhu et al.
- Myers-briggs Personality Classification And Personality-specific Language Generation Using Pre-trained Language Models Sedrick Scott Keh, I-tsun Cheng
- Gpt-based Generation For Classical Chinese Poetry Yi Liao, Yasheng Wang, Qun Liu, Xin Jiang
- Modeling Recurrence For Transformer Jie Hao et al.
- A Generalized Framework Of Sequence Generation With Application To Undirected Sequence Models Elman Mansimov, Alex Wang, Sean Welleck, Kyunghyun Cho
- Transfertransfo: A Transfer Learning Approach For Neural Network Based Conversational Agents Thomas Wolf, Victor Sanh, Julien Chaumond, Clement Delangue
- The Second Conversational Intelligence Challenge (convai2) Emily Dinan et al.
- An Effective Domain Adaptive Post-training Method For BERT In Response Selection Taesun Whang et al.
- The Bottom-up Evolution Of Representations In The Transformer: A Study With Machine Translation And Language Modeling Objectives Elena Voita, Rico Sennrich, Ivan Titov
- How Does BERT Answer Questions? A Layer-wise Analysis Of Transformer Representations Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- What Would Elsa Do? Freezing Layers During Transformer Fine-tuning Jaejun Lee, Raphael Tang, Jimmy Lin
- Exbert: A Visual Analysis Tool To Explore Learned Representations In Transformers Models Benjamin Hoover, Hendrik Strobelt, Sebastian Gehrmann
- Adding Interpretable Attention To Neural Translation Models Improves Word Alignment Thomas Zenkel, Joern Wuebker, John Denero
- VL-BERT: Pre-training Of Generic Visual-linguistic Representations Weijie Su et al.
- Story Ending Prediction By Transferable BERT Zhongyang Li, Xiao Ding, Ting Liu
- Analyzing Multi-head Self-attention: Specialized Heads Do The Heavy Lifting, The Rest Can Be Pruned Elena Voita, David Talbot, Fedor Moiseev, Rico Sennrich, Ivan Titov
- Learning To Deceive With Attention-based Explanations Danish Pruthi, Mansi Gupta, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton
- MASS: Masked Sequence To Sequence Pre-training For Language Generation Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-yan Liu
- The Evolved Transformer David R. So, Chen Liang, Quoc V. Le
- A Simple But Effective Method To Incorporate Multi-turn Context With BERT For Conversational Machine Comprehension Yasuhito Ohsugi, Itsumi Saito, Kyosuke Nishida, Hisako Asano, Junji Tomita
- Insertion-based Decoding With Automatically Inferred Generation Order Jiatao Gu, Qi Liu, Kyunghyun Cho
- Scheduled Sampling For Transformers Tsvetomila Mihaylova, André F. T. Martins
- Levenshtein Transformer Jiatao Gu, Changhan Wang, Jake Zhao
- Interpreting And Improving Natural-language Processing (in Machines) With Natural Language-processing (in The Brain) Mariya Toneva, Leila Wehbe
- On The Use Of BERT For Neural Machine Translation Stéphane Clinchant, Kweon Woo Jung, Vassilina Nikoulina
- Leveraging Pre-trained Checkpoints For Sequence Generation Tasks Sascha Rothe, Shashi Narayan, Aliaksei Severyn
- Multi-hop Question Answering Via Reasoning Chains Jifan Chen, Shih-ting Lin, Greg Durrett
- CTRL: A Conditional Transformer Language Model For Controllable Generation Nitish Shirish Keskar, Bryan Mccann, Lav R. Varshney, Caiming Xiong, Richard Socher
- Distilling Knowledge Learned In BERT For Text Generation Yen-chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu
- Berts Of A Feather Do Not Generalize Together: Large Variability In Generalization Across Models With Similar Test Set Performance R. Thomas Mccoy, Junghyun Min, Tal Linzen
- Paraphrasing With Large Language Models Sam Witteveen, Martin Andrews
- Semantics-aware BERT For Language Understanding Zhuosheng Zhang et al.
- Data Augmentation For BERT Fine-tuning In Open-domain Question Answering Wei Yang et al.
- Patent Claim Generation By Fine-tuning Openai GPT-2 Jieh-sheng Lee, Jieh Hsiang
- Stabilizing Transformers For Reinforcement Learning Emilio Parisotto et al.
- Learning And Evaluating General Linguistic Intelligence Dani Yogatama et al.
- Dialogpt: Large-scale Generative Pre-training For Conversational Response Generation Yizhe Zhang et al.
- Deep Learning Based Chatbot Models Richard Csaky
- Inducing Brain-relevant Bias In Natural Language Processing Models Dan Schwartz, Mariya Toneva, Leila Wehbe
- A Multiscale Visualization Of Attention In The Transformer Model Jesse Vig
- Parameter-efficient Transfer Learning For NLP Neil Houlsby et al.
- Controlling The Output Length Of Neural Machine Translation Surafel Melaku Lakew, Mattia Di Gangi, Marcello Federico
- Bp-transformer: Modelling Long-range Context Via Binary Partitioning Zihao Ye, Qipeng Guo, Quan Gan, Xipeng Qiu, Zheng Zhang
- Automatic Spanish Translation Of The Squad Dataset For Multilingual Question Answering Casimiro Pio Carrino, Marta R. Costa-jussà, José A. R. Fonollosa
- Visualizing And Understanding The Effectiveness Of BERT Yaru Hao, Li Dong, Furu Wei, Ke Xu
- Do Attention Heads In BERT Track Syntactic Dependencies? Phu Mon Htut, Jason Phang, Shikha Bordia, Samuel R. Bowman
- Tree Transformer: Integrating Tree Structures Into Self-attention Yau-shian Wang, Hung-yi Lee, Yun-nung Chen
- Structured Pruning Of Large Language Models Ziheng Wang, Jeremy Wohlwend, Tao Lei
- Bertscore: Evaluating Text Generation With BERT Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q. Weinberger, Yoav Artzi
- Attentive History Selection For Conversational Question Answering Chen Qu et al.
- Freelb: Enhanced Adversarial Training For Natural Language Understanding Chen Zhu et al.
- Evaluating Commonsense In Pre-trained Language Models Xuhui Zhou, Yue Zhang, Leyang Cui, Dandan Huang
- Augmenting Self-attention With Persistent Memory Sainbayar Sukhbaatar, Edouard Grave, Guillaume Lample, Herve Jegou, Armand Joulin
- Model Compression With Two-stage Multi-teacher Knowledge Distillation For Web Question Answering System Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang
- Adaptive Attention Span In Transformers Sainbayar Sukhbaatar, Edouard Grave, Piotr Bojanowski, Armand Joulin
- Learning And Evaluating Contextual Embedding Of Source Code Aditya Kanade, Petros Maniatis, Gogul Balakrishnan, Kensen Shi
- Linguistic Knowledge And Transferability Of Contextual Representations Nelson F. Liu, Matt Gardner, Yonatan Belinkov, Matthew E. Peters, Noah A. Smith
- Unicoder-vl: A Universal Encoder For Vision And Language By Cross-modal Pre-training Gen Li et al.
- Plug And Play Language Models: A Simple Approach To Controlled Text Generation Sumanth Dathathri et al.
- Hello, It's GPT-2 -- How Can I Help You? Towards The Use Of Pretrained Language Models For Task-oriented Dialogue Systems Paweł Budzianowski, Ivan Vulić
- A Survey Of Natural Language Generation Techniques With A Focus On Dialogue Systems - Past, Present And Future Directions Sashank Santhanam, Samira Shaikh
- Blockwise Self-attention For Long Document Understanding Jiezhong Qiu et al.
- Span Selection Pre-training For Question Answering Michael Glass et al.
- Adapting And Evaluating A Deep Learning Language Model For Clinical Why-question Answering Andrew Wen, Mohamed Y. Elwazir, Sungrim Moon, Jungwei Fan
- Do Massively Pretrained Language Models Make Better Storytellers? Abigail See, Aneesh Pappu, Rohun Saxena, Akhila Yerukola, Christopher D. Manning
- Roberta: A Robustly Optimized BERT Pretraining Approach Yinhan Liu et al.
- Fast Transformer Decoding: One Write-head Is All You Need Noam Shazeer
- Cosmos QA: Machine Reading Comprehension With Contextual Commonsense Reasoning Lifu Huang, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi
- Semantically Conditioned Dialog Response Generation Via Hierarchical Disentangled Self-attention Wenhu Chen, Jianshu Chen, Pengda Qin, Xifeng Yan, William Yang Wang
- Bridging The Gap For Tokenizer-free Language Models Dokook Choe, Rami Al-rfou, Mandy Guo, Heeyoung Lee, Noah Constant
- Do Neural Dialog Systems Use The Conversation History Effectively? An Empirical Study Chinnadhurai Sankar, Sandeep Subramanian, Christopher Pal, Sarath Chandar, Yoshua Bengio
- Insertion Transformer: Flexible Sequence Generation Via Insertion Operations Mitchell Stern, William Chan, Jamie Kiros, Jakob Uszkoreit
- ACUTE-EVAL: Improved Dialogue Evaluation With Optimized Questions And Multi-turn Comparisons Margaret Li, Jason Weston, Stephen Roller
- Lakhnes: Improving Multi-instrumental Music Generation With Cross-domain Pre-training Chris Donahue, Huanru Henry Mao, Yiting Ethan Li, Garrison W. Cottrell, Julian Mcauley
- Synthetic QA Corpora Generation With Roundtrip Consistency Chris Alberti, Daniel Andor, Emily Pitler, Jacob Devlin, Michael Collins
- Context-aware Learning For Neural Machine Translation Sébastien Jean, Kyunghyun Cho
- Fusion Of Detected Objects In Text For Visual Question Answering Chris Alberti, Jeffrey Ling, Michael Collins, David Reitter
- Microsoft Translator At WMT 2019: Towards Large-scale Document-level Neural Machine Translation Marcin Junczys-dowmunt
- Synchronous Bidirectional Inference For Neural Sequence Generation Jiajun Zhang, Long Zhou, Yang Zhao, Chengqing Zong
- Megatron-lm: Training Multi-billion Parameter Language Models Using Model Parallelism Mohammad Shoeybi et al.
- ALBERT: A Lite BERT For Self-supervised Learning Of Language Representations Zhenzhong Lan et al.
- Learning To Few-shot Learn Across Diverse Natural Language Classification Tasks Trapit Bansal, Rishikesh Jha, Andrew Mccallum
- Beto, Bentz, Becas: The Surprising Cross-lingual Effectiveness Of BERT Shijie Wu, Mark Dredze
- What Does BERT Learn From Multiple-choice Reading Comprehension Datasets? Chenglei Si, Shuohang Wang, Min-yen Kan, Jing Jiang
- Improving Knowledge-aware Dialogue Generation Via Knowledge Base Question Answering Jian Wang et al.
- Text Infilling Wanrong Zhu, Zhiting Hu, Eric Xing
- Syntax-infused Transformer And BERT Models For Machine Translation And Natural Language Understanding Dhanasekar Sundararaman et al.
- Transfer Fine-tuning: A BERT Case Study Yuki Arase, Junichi Tsujii
- Towards Transfer Learning For End-to-end Speech Synthesis From Deep Pre-trained Language Models Wei Fang, Yu-an Chung, James Glass
- Pretrained Encyclopedia: Weakly Supervised Knowledge-pretrained Language Model Wenhan Xiong, Jingfei Du, William Yang Wang, Veselin Stoyanov
- Few-shot NLG With Pre-trained Language Model Zhiyu Chen, Harini Eavani, Wenhu Chen, Yinyin Liu, William Yang Wang
- Text Summarization With Pretrained Encoders Yang Liu, Mirella Lapata
- Towards Making The Most Of BERT In Neural Machine Translation Jiacheng Yang et al.
- A Modular Task-oriented Dialogue System Using A Neural Mixture-of-experts Jiahuan Pei, Pengjie Ren, Maarten De Rijke
- TANDA: Transfer And Adapt Pre-trained Transformer Models For Answer Sentence Selection Siddhant Garg, Thuy Vu, Alessandro Moschitti
- Incremental Transformer With Deliberation Decoder For Document Grounded Conversations Zekang Li et al.
- Exploring The Limits Of Transfer Learning With A Unified Text-to-text Transformer Colin Raffel et al.
- Learning To Answer By Learning To Ask: Getting The Best Of GPT-2 And BERT Worlds Tassilo Klein, Moin Nabi
- Attention-informed Mixed-language Training For Zero-shot Cross-lingual Task-oriented Dialogue Systems Zihan Liu, Genta Indra Winata, Zhaojiang Lin, Peng Xu, Pascale Fung
- Encoder-agnostic Adaptation For Conditional Language Generation Zachary M. Ziegler, Luke Melas-kyriazi, Sebastian Gehrmann, Alexander M. Rush
- Visualizing Attention In Transformer-based Language Representation Models Jesse Vig
- Harnessing Evolution Of Multi-turn Conversations For Effective Answer Retrieval Mohammad Aliannejadi, Manajit Chakraborty, Esteban Andrés Ríssola, Fabio Crestani
- Sg-net: Syntax-guided Machine Reading Comprehension Zhuosheng Zhang et al.
- Modifying Memories In Transformer Models Chen Zhu et al.
- Colake: Contextualized Language And Knowledge Embedding Tianxiang Sun et al.
- UNIMO: Towards Unified-modal Understanding And Generation Via Cross-modal Contrastive Learning Wei Li et al.
- Low-rank Bottleneck In Multi-head Attention Models Srinadh Bhojanapalli, Chulhee Yun, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar
- Open-retrieval Conversational Question Answering Chen Qu et al.
- Pre-training Text-to-text Transformers For Concept-centric Common Sense Wangchunshu Zhou et al.
- Controlled Hallucinations: Learning To Generate Faithfully From Noisy Data Katja Filippova
- SEAL: Segment-wise Extractive-abstractive Long-form Text Summarization Yao Zhao, Mohammad Saleh, Peter J. Liu
- XGLUE: A New Benchmark Dataset For Cross-lingual Pre-training, Understanding And Generation Yaobo Liang et al.
- Linformer: Self-attention With Linear Complexity Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, Hao Ma
- To Pretrain Or Not To Pretrain: Examining The Benefits Of Pretraining On Resource Rich Tasks Sinong Wang, Madian Khabsa, Hao Ma
- Unsupervised Paraphrase Generation Using Pre-trained Language Models Chaitra Hegde, Shrikumar Patil
- Measuring Systematic Generalization In Neural Proof Generation With Transformers Nicolas Gontier, Koustuv Sinha, Siva Reddy, Christopher Pal
- SPARTA: Efficient Open-domain Question Answering Via Sparse Transformer Matching Retrieval Tiancheng Zhao, Xiaopeng Lu, Kyusong Lee
- Bert-of-theseus: Compressing BERT By Progressive Module Replacing Canwen Xu, Wangchunshu Zhou, Tao Ge, Furu Wei, Ming Zhou
- Transformers As Soft Reasoners Over Language Peter Clark, Oyvind Tafjord, Kyle Richardson
- Train Large, Then Compress: Rethinking Model Size For Efficient Training And Inference Of Transformers Zhuohan Li et al.
- Pre-trained Summarization Distillation Sam Shleifer, Alexander M. Rush
- Sequential Latent Knowledge Selection For Knowledge-grounded Dialogue Byeongchang Kim, Jaewoo Ahn, Gunhee Kim
- Russiansuperglue: A Russian Language Understanding Evaluation Benchmark Tatiana Shavrina et al.
- Progressive Generation Of Long Text With Pretrained Language Models Bowen Tan, Zichao Yang, Maruan Al-shedivat, Eric P. Xing, Zhiting Hu
- Artificial Intelligence Versus Maya Angelou: Experimental Evidence That People Cannot Differentiate Ai-generated From Human-written Poetry Nils Köbis, Luca Mossink
- The Radicalization Risks Of GPT-3 And Advanced Neural Language Models Kris Mcguffie, Alex Newhouse
- KRISP: Integrating Implicit And Symbolic Knowledge For Open-domain Knowledge-based VQA Kenneth Marino, Xinlei Chen, Devi Parikh, Abhinav Gupta, Marcus Rohrbach
- Fine-tuning Pretrained Language Models: Weight Initializations, Data Orders, And Early Stopping Jesse Dodge et al.
- When BERT Plays The Lottery, All Tickets Are Winning Sai Prasanna, Anna Rogers, Anna Rumshisky
- Pretrained Transformers Improve Out-of-distribution Robustness Dan Hendrycks et al.
- Phobert: Pre-trained Language Models For Vietnamese Dat Quoc Nguyen, Anh Tuan Nguyen
- The Chess Transformer: Mastering Play Using Generative Language Models David Noever, Matt Ciolino, Josh Kalin
- Injecting Numerical Reasoning Skills Into Language Models Mor Geva, Ankit Gupta, Jonathan Berant
- Conversational Question Reformulation Via Sequence-to-sequence Architectures And Pretrained Language Models Sheng-chieh Lin et al.
- Intermediate-task Transfer Learning With Pretrained Models For Natural Language Understanding: When And Why Does It Work? Yada Pruksachatkun et al.
- Pretrained Transformers For Simple Question Answering Over Knowledge Graphs D. Lukovnikov, A. Fischer, J. Lehmann
- KVL-BERT: Knowledge Enhanced Visual-and-linguistic BERT For Visual Commonsense Reasoning Dandan Song, Siyi Ma, Zhanchen Sun, Sicheng Yang, Lejian Liao
- Layoutlmv2: Multi-modal Pre-training For Visually-rich Document Understanding Yang Xu et al.
- CG-BERT: Conditional Text Generation With BERT For Generalized Few-shot Intent Detection Congying Xia, Chenwei Zhang, Hoang Nguyen, Jiawei Zhang, Philip Yu
- How Effective Is Task-agnostic Data Augmentation For Pretrained Transformers? Shayne Longpre, Yu Wang, Christopher Dubois
- Pymt5: Multi-mode Translation Of Natural Language And Python Code With Transformers Colin B. Clement, Dawn Drain, Jonathan Timcheck, Alexey Svyatkovskiy, Neel Sundaresan
- Lite Transformer With Long-short Range Attention Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han
- Deebert: Dynamic Early Exiting For Accelerating BERT Inference Ji Xin, Raphael Tang, Jaejun Lee, Yaoliang Yu, Jimmy Lin
- Speaker-aware BERT For Multi-turn Response Selection In Retrieval-based Chatbots Jia-chen Gu et al.
- Unnatural Language Inference Koustuv Sinha, Prasanna Parthasarathi, Joelle Pineau, Adina Williams
- Optimus: Organizing Sentences Via Pre-trained Modeling Of A Latent Space Chunyuan Li et al.
- Knowledge-aware Language Model Pretraining Corby Rosset et al.
- Mapping Natural Language Instructions To Mobile UI Action Sequences Yang Li, Jiacong He, Xin Zhou, Yuan Zhang, Jason Baldridge
- Do Response Selection Models Really Know What's Next? Utterance Manipulation Strategies For Multi-turn Response Selection Taesun Whang et al.
- Making Pre-trained Language Models Better Few-shot Learners Tianyu Gao, Adam Fisch, Danqi Chen
- Masking As An Efficient Alternative To Finetuning For Pretrained Language Models Mengjie Zhao, Tao Lin, Fei Mi, Martin Jaggi, Hinrich Schütze
- Exploring Fine-tuning Techniques For Pre-trained Cross-lingual Models Via Continual Learning Zihan Liu, Genta Indra Winata, Andrea Madotto, Pascale Fung
- Code Prediction By Feeding Trees To Transformers Seohyun Kim, Jinman Zhao, Yuchi Tian, Satish Chandra
- Enabling Language Models To Fill In The Blanks Chris Donahue, Mina Lee, Percy Liang
- Ternarybert: Distillation-aware Ultra-low Bit BERT Wei Zhang et al.
- TOD-BERT: Pre-trained Natural Language Understanding For Task-oriented Dialogue Chien-sheng Wu, Steven Hoi, Richard Socher, Caiming Xiong
- A Knowledge-enhanced Pretraining Model For Commonsense Story Generation Jian Guan, Fei Huang, Zhihao Zhao, Xiaoyan Zhu, Minlie Huang
- Coda: Contrast-enhanced And Diversity-promoting Data Augmentation For Natural Language Understanding Yanru Qu et al.
- Gpt-too: A Language-model-first Approach For Amr-to-text Generation Manuel Mager et al.
- VD-BERT: A Unified Vision And Dialog Transformer With BERT Yue Wang et al.
- Compressing Large-scale Transformer-based Models: A Case Study On BERT Prakhar Ganesh et al.
- KG-BART: Knowledge Graph-augmented BART For Generative Commonsense Reasoning Ye Liu, Yao Wan, Lifang He, Hao Peng, Philip S. Yu
- Better Robustness By More Coverage: Adversarial Training With Mixup Augmentation For Robust Fine-tuning Chenglei Si et al.
- IART: Intent-aware Response Ranking With Transformers In Information-seeking Conversation Systems Liu Yang et al.
- Few-shot Generative Conversational Query Rewriting Shi Yu et al.
- Explaining Question Answering Models Through Text Generation Veronica Latcinnik, Jonathan Berant
- EDITOR: An Edit-based Transformer With Repositioning For Neural Machine Translation With Soft Lexical Constraints Weijia Xu, Marine Carpuat
- Byte Pair Encoding Is Suboptimal For Language Model Pretraining Kaj Bostrom, Greg Durrett
- It's Not Just Size That Matters: Small Language Models Are Also Few-shot Learners Timo Schick, Hinrich Schütze
- Sequence-level Mixed Sample Data Augmentation Demi Guo, Yoon Kim, Alexander M. Rush
- Coreferential Reasoning Learning For Language Representation Deming Ye et al.
- A Simple But Tough-to-beat Data Augmentation Approach For Natural Language Understanding And Generation Dinghan Shen, Mingzhi Zheng, Yelong Shen, Yanru Qu, Weizhu Chen
- XGPT: Cross-modal Generative Pre-training For Image Captioning Qiaolin Xia et al.
- Inducing Language-agnostic Multilingual Representations Wei Zhao, Steffen Eger, Johannes Bjerva, Isabelle Augenstein
- On The Stability Of Fine-tuning BERT: Misconceptions, Explanations, And Strong Baselines Marius Mosbach, Maksym Andriushchenko, Dietrich Klakow
- ERNIE-GEN: An Enhanced Multi-flow Pre-training And Fine-tuning Framework For Natural Language Generation Dongling Xiao et al.
- Gshard: Scaling Giant Models With Conditional Computation And Automatic Sharding Dmitry Lepikhin et al.
- Codebert: A Pre-trained Model For Programming And Natural Languages Zhangyin Feng et al.
- A Simple Language Model For Task-oriented Dialogue Ehsan Hosseini-asl, Bryan Mccann, Chien-sheng Wu, Semih Yavuz, Richard Socher
- GRUEN For Evaluating Linguistic Quality Of Generated Text Wanzheng Zhu, Suma Bhat
- Rapidly Bootstrapping A Question Answering Dataset For COVID-19 Raphael Tang et al.
- Deformer: Decomposing Pre-trained Transformers For Faster Question Answering Qingqing Cao, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian
- Efficient Transformer-based Large Scale Language Representations Using Hardware-friendly Block Structured Pruning Bingbing Li et al.
- TIME: Text And Image Mutual-translation Adversarial Networks Bingchen Liu, Kunpeng Song, Yizhe Zhu, Gerard De Melo, Ahmed Elgammal
- PALM: Pre-training An Autoencoding&autoregressive Language Model For Context-conditioned Generation Bin Bi et al.
- Query Resolution For Conversational Search With Limited Supervision Nikos Voskarides, Dan Li, Pengjie Ren, Evangelos Kanoulas, Maarten De Rijke
- Visbert: Hidden-state Visualizations For Transformers Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- Long Range Arena: A Benchmark For Efficient Transformers Yi Tay et al.
- When Being Unseen From Mbert Is Just The Beginning: Handling New Languages With Multilingual Language Models Benjamin Muller, Antonis Anastasopoulos, Benoît Sagot, Djamé Seddah
- Training Large Neural Networks With Constant Memory Using A New Execution Algorithm Bharadwaj Pudipeddi, Maral Mesmakhosroshahi, Jinwen Xi, Sujeeth Bharadwaj
- Tabert: Pretraining For Joint Understanding Of Textual And Tabular Data Pengcheng Yin, Graham Neubig, Wen-tau Yih, Sebastian Riedel
- Gedi: Generative Discriminator Guided Sequence Generation Ben Krause et al.
- Unqovering Stereotyping Biases Via Underspecified Questions Tao Li, Tushar Khot, Daniel Khashabi, Ashish Sabharwal, Vivek Srikumar
- PONE: A Novel Automatic Evaluation Metric For Open-domain Generative Dialogue Systems Tian Lan, Xian-ling Mao, Wei Wei, Xiaoyan Gao, Heyan Huang
- SOLOIST: Building Task Bots At Scale With Transfer Learning And Machine Teaching Baolin Peng et al.
- Few-shot Natural Language Generation For Task-oriented Dialog Baolin Peng et al.
- DUMA: Reading Comprehension With Transposition Thinking Pengfei Zhu, Hai Zhao, Xiaoguang Li
- Contrastive Distillation On Intermediate Representations For Language Model Compression Siqi Sun et al.
- Robust Conversational AI With Grounded Text Generation Jianfeng Gao et al.
- Synthesizer: Rethinking Self-attention In Transformer Models Yi Tay et al.
- When Do You Need Billions Of Words Of Pretraining Data? Yian Zhang, Alex Warstadt, Haau-sing Li, Samuel R. Bowman
- Bert-hlstms: BERT And Hierarchical Lstms For Visual Storytelling Jing Su, Qingyun Dai, Frank Guerin, Mian Zhou
- Beyond I.I.D.: Three Levels Of Generalization For Question Answering On Knowledge Bases Yu Gu et al.
- Genaug: Data Augmentation For Finetuning Text Generators Steven Y. Feng, Varun Gangal, Dongyeop Kang, Teruko Mitamura, Eduard Hovy
- Chatbot Interaction With Artificial Intelligence: Human Data Augmentation With T5 And Language Transformer Ensemble For Text Classification Jordan J. Bird, Anikó Ekárt, Diego R. Faria
- Improving Vision-and-language Navigation With Image-text Pairs From The Web Arjun Majumdar et al.
- SPECTER: Document-level Representation Learning Using Citation-informed Transformers Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld
- DIET: Lightweight Language Understanding For Dialogue Systems Tanja Bunk, Daksh Varshneya, Vladimir Vlasov, Alan Nichol
- MART: Memory-augmented Recurrent Transformer For Coherent Video Paragraph Captioning Jie Lei et al.
- From Zero To Hero: On The Limitations Of Zero-shot Cross-lingual Transfer With Multilingual Transformers Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš
- A Comparison Of LSTM And BERT For Small Corpus Aysu Ezen-can
- GMAT: Global Memory Augmentation For Transformers Ankit Gupta, Jonathan Berant
- A Large-scale Chinese Short-text Conversation Dataset Yida Wang et al.
- Just Ask: Learning To Answer Questions From Millions Of Narrated Videos Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- Addressing Some Limitations Of Transformers With Feedback Memory Angela Fan, Thibaut Lavril, Edouard Grave, Armand Joulin, Sainbayar Sukhbaatar
- A Recurrent Vision-and-language BERT For Navigation Yicong Hong, Qi Wu, Yuankai Qi, Cristian Rodriguez-opazo, Stephen Gould
- Talking-heads Attention Noam Shazeer, Zhenzhong Lan, Youlong Cheng, Nan Ding, Le Hou
- ABNIRML: Analyzing The Behavior Of Neural IR Models Sean Macavaney, Sergey Feldman, Nazli Goharian, Doug Downey, Arman Cohan
- ECONET: Effective Continual Pretraining Of Language Models For Event Temporal Reasoning Rujun Han, Xiang Ren, Nanyun Peng
- Incorporating BERT Into Parallel Sequence Decoding With Adapters Junliang Guo et al.
- Adapterdrop: On The Efficiency Of Adapters In Transformers Andreas Rücklé et al.
- Language Models As Few-shot Learner For Task-oriented Dialogue Systems Andrea Madotto, Zihan Liu, Zhaojiang Lin, Pascale Fung
- Natural Language Rationales With Full-stack Visual Reasoning: From Pixels To Semantic Frames To Commonsense Graphs Ana Marasović et al.
- CPM: A Large-scale Generative Chinese Pre-trained Language Model Zhengyan Zhang et al.
- Syntactic Data Augmentation Increases Robustness To Inference Heuristics Junghyun Min, R. Thomas Mccoy, Dipanjan Das, Emily Pitler, Tal Linzen
- End-to-end Synthetic Data Generation For Domain Adaptation Of Question Answering Systems Siamak Shakeri et al.
- What Happens To BERT Embeddings During Fine-tuning? Amil Merchant, Elahe Rahimtoroghi, Ellie Pavlick, Ian Tenney
- Ernie-doc: A Retrospective Long-document Modeling Transformer Siyu Ding et al.
- Are We Pretraining It Right? Digging Deeper Into Visio-linguistic Pretraining Amanpreet Singh, Vedanuj Goswami, Devi Parikh
- Behind The Scene: Revealing The Secrets Of Pre-trained Vision-and-language Models Jize Cao et al.
- Relevance-guided Supervision For Openqa With Colbert Omar Khattab, Christopher Potts, Matei Zaharia
- Contrastive Learning With Adversarial Perturbations For Conditional Text Generation Seanie Lee, Dong Bok Lee, Sung Ju Hwang
- Proofwriter: Generating Implications, Proofs, And Abductive Statements Over Natural Language Oyvind Tafjord, Bhavana Dalvi Mishra, Peter Clark
- Cocon: A Self-supervised Approach For Controlled Text Generation Alvin Chan, Yew-soon Ong, Bill Pung, Aston Zhang, Jie Fu
- Leap-of-thought: Teaching Pre-trained Models To Systematically Reason Over Implicit Knowledge Alon Talmor, Oyvind Tafjord, Peter Clark, Yoav Goldberg, Jonathan Berant
- BLEURT: Learning Robust Metrics For Text Generation Thibault Sellam, Dipanjan Das, Ankur P. Parikh
- Edgebert: Sentence-level Energy Optimizations For Latency-aware Multi-task NLP Inference Thierry Tambe et al.
- Recall And Learn: Fine-tuning Deep Pretrained Language Models With Less Forgetting Sanyuan Chen et al.
- Non-autoregressive Machine Translation With Disentangled Context Transformer Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu
- Coregen: Contextualized Code Representation Learning For Commit Message Generation Lun Yiu Nie et al.
- GOBO: Quantizing Attention-based NLP Models For Low Latency And Energy Efficient Inference Ali Hadi Zadeh, Isak Edo, Omar Mohamed Awad, Andreas Moshovos
- TRANS-BLSTM: Transformer With Bidirectional LSTM For Language Understanding Zhiheng Huang, Peng Xu, Davis Liang, Ajay Mishra, Bing Xiang
- Auto-captions On GIF: A Large-scale Video-sentence Dataset For Vision-language Pre-training Yingwei Pan et al.
- Intellicode Compose: Code Generation Using Transformer Alexey Svyatkovskiy, Shao Kun Deng, Shengyu Fu, Neel Sundaresan
- Fine-tuning BERT For Schema-guided Zero-shot Dialogue State Tracking Yu-ping Ruan, Zhen-hua Ling, Jia-chen Gu, Quan Liu
- GREEK-BERT: The Greeks Visiting Sesame Street John Koutsikakis, Ilias Chalkidis, Prodromos Malakasiotis, Ion Androutsopoulos
- Improving Natural Language Processing Tasks With Human Gaze-guided Neural Attention Ekta Sood, Simon Tannert, Philipp Mueller, Andreas Bulling
- Contrastive Code Representation Learning Paras Jain et al.
- An Empirical Investigation Of Pre-trained Transformer Language Models For Open-domain Dialogue Generation Piji Li
- Adapterhub: A Framework For Adapting Transformers Jonas Pfeiffer et al.
- POINTER: Constrained Progressive Text Generation Via Insertion-based Generative Pre-training Yizhe Zhang et al.
- Realtoxicityprompts: Evaluating Neural Toxic Degeneration In Language Models Samuel Gehman, Suchin Gururangan, Maarten Sap, Yejin Choi, Noah A. Smith
- Prophetnet: Predicting Future N-gram For Sequence-to-sequence Pre-training Weizhen Qi et al.
- The Cascade Transformer: An Application For Efficient Answer Sentence Selection Luca Soldaini, Alessandro Moschitti
- Machine Reading Comprehension: The Role Of Contextualized Language Models And Beyond Zhuosheng Zhang, Hai Zhao, Rui Wang
- Automated Source Code Generation And Auto-completion Using Deep Learning: Comparing And Discussing Current Language-model-related Approaches Juan Cruz-benito, Sanjay Vishwakarma, Francisco Martin-fernandez, Ismael Faro
- Logic-guided Data Augmentation And Regularization For Consistent Question Answering Akari Asai, Hannaneh Hajishirzi
- Encoding Syntactic Knowledge In Transformer Encoder For Intent Detection And Slot Filling Jixuan Wang, Kai Wei, Martin Radfar, Weiwei Zhang, Clement Chung
- Grounding Language To Autonomously-acquired Skills Via Goal Generation Ahmed Akakzia, Cédric Colas, Pierre-yves Oudeyer, Mohamed Chetouani, Olivier Sigaud
- Look Before You Speak: Visually Contextualized Utterances Paul Hongsuck Seo, Arsha Nagrani, Cordelia Schmid
- Retrieval-augmented Generation For Knowledge-intensive NLP Tasks Patrick Lewis et al.
- ETC: Encoding Long And Structured Inputs In Transformers Joshua Ainslie et al.
- Dialoglue: A Natural Language Understanding Benchmark For Task-oriented Dialogue Shikib Mehri, Mihail Eric, Dilek Hakkani-tur
- A Survey Of Knowledge-enhanced Text Generation Wenhao Yu et al.
- Simplifying Paragraph-level Question Generation Via Transformer Language Models Luis Enrico Lopez, Diane Kathryn Cruz, Jan Christian Blaise Cruz, Charibeth Cheng
- Big Bird: Transformers For Longer Sequences Manzil Zaheer et al.
- Assessing Phrasal Representation And Composition In Transformers Lang Yu, Allyson Ettinger
- UBAR: Towards Fully End-to-end Task-oriented Dialog Systems With GPT-2 Yunyi Yang, Yunhao Li, Xiaojun Quan
- Narrative Interpolation For Generating And Understanding Stories Su Wang, Greg Durrett, Katrin Erk
- Dialoguetrm: Exploring The Intra- And Inter-modal Emotional Behaviors In The Conversation Yuzhao Mao et al.
- Pchatbot: A Large-scale Dataset For Personalized Chatbot Hongjin Qian et al.
- Language Models Are Few-shot Learners Tom B. Brown et al.
- Text-to-text Pre-training For Data-to-text Tasks Mihir Kale, Abhinav Rastogi
- How Can We Know When Language Models Know? On The Calibration Of Language Models For Question Answering Zhengbao Jiang, Jun Araki, Haibo Ding, Graham Neubig
- Hard-coded Gaussian Attention For Neural Machine Translation Weiqiu You, Simeng Sun, Mohit Iyyer
- Earlybert: Efficient BERT Training Via Early-bird Lottery Tickets Xiaohan Chen et al.
- Cosda-ml: Multi-lingual Code-switching Data Augmentation For Zero-shot Cross-lingual NLP Libo Qin, Minheng Ni, Yue Zhang, Wanxiang Che
- Probing Pretrained Language Models For Lexical Semantics Ivan Vulić, Edoardo Maria Ponti, Robert Litschko, Goran Glavaš, Anna Korhonen
- Human Instruction-following With Deep Reinforcement Learning Via Transfer-learning From Text Felix Hill, Sona Mokra, Nathaniel Wong, Tim Harley
- How Context Affects Language Models' Factual Predictions Fabio Petroni et al.
- How Fine Can Fine-tuning Be? Learning Efficient Language Models Evani Radiya-dixit, Xin Wang
- Calibration Of Pre-trained Transformers Shrey Desai, Greg Durrett
- Variational Transformers For Diverse Response Generation Zhaojiang Lin, Genta Indra Winata, Peng Xu, Zihan Liu, Pascale Fung
- Robust Encodings: A Framework For Combating Adversarial Typos Erik Jones, Robin Jia, Aditi Raghunathan, Percy Liang
- Turngpt: A Transformer-based Language Model For Predicting Turn-taking In Spoken Dialog Erik Ekstedt, Gabriel Skantze
- Accelerating Training Of Transformer-based Language Models With Progressive Layer Dropping Minjia Zhang, Yuxiong He
- A Transformer-based Approach For Source Code Summarization Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-wei Chang
- CERT: Contrastive Self-supervised Learning For Language Understanding Hongchao Fang, Sicheng Wang, Meng Zhou, Jiayuan Ding, Pengtao Xie
- XLM-T: Scaling Up Multilingual Machine Translation With Pretrained Cross-lingual Transformer Encoders Shuming Ma et al.
- A Controllable Model Of Grounded Response Generation Zeqiu Wu et al.
- On Learning Universal Representations Across Languages Xiangpeng Wei et al.
- Length-adaptive Transformer: Train Once With Length Drop, Use Anytime With Search Gyuwan Kim, Kyunghyun Cho
- What Does BERT Know About Books, Movies And Music? Probing BERT For Conversational Recommendation Gustavo Penha, Claudia Hauff
- Rethinking Positional Encoding In Language Pre-training Guolin Ke, Di He, Tie-yan Liu
- KGPT: Knowledge-grounded Pre-training For Data-to-text Generation Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang
- Residual Energy-based Models For Text Generation Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'aurelio Ranzato
- Contrastive Triple Extraction With Generative Transformer Hongbin Ye et al.
- Aragpt2: Pre-trained Transformer For Arabic Language Generation Wissam Antoun, Fady Baly, Hazem Hajj
- Training Question Answering Models From Synthetic Data Raul Puri, Ryan Spring, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro
- Very Deep Transformers For Neural Machine Translation Xiaodong Liu, Kevin Duh, Liyuan Liu, Jianfeng Gao
- Mixup-transformer: Dynamic Data Augmentation For NLP Tasks Lichao Sun et al.
- Mobilebert: A Compact Task-agnostic BERT For Resource-limited Devices Zhiqing Sun et al.
- Dialogbert: Discourse-aware Response Generation Via Learning To Recover And Rank Utterances Xiaodong Gu, Kang Min Yoo, Jung-woo Ha
- Document Ranking With A Pretrained Sequence-to-sequence Model Rodrigo Nogueira, Zhiying Jiang, Jimmy Lin
- DAVE: Deriving Automatically Verilog From English Hammond Pearce, Benjamin Tan, Ramesh Karri
- Indic-transformers: An Analysis Of Transformer Language Models For Indian Languages Kushal Jain, Adwait Deshpande, Kumar Shridhar, Felix Laumann, Ayushman Dash
- Plotmachines: Outline-conditioned Generation With Dynamic Plot State Tracking Hannah Rashkin, Asli Celikyilmaz, Yejin Choi, Jianfeng Gao
- Unilmv2: Pseudo-masked Language Models For Unified Language Model Pre-training Hangbo Bao et al.
- Multilingual Speech Translation With Efficient Finetuning Of Pretrained Models Xian Li et al.
- Emptransfo: A Multi-head Transformer Architecture For Creating Empathetic Dialog Systems Rohola Zandie, Mohammad H. Mahoor
- DSTC8-AVSD: Multimodal Semantic Transformer Network With Retrieval Style Word Generator Hwanhee Lee et al.
- Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation Ruibo Liu et al.
- On The Effect Of Dropping Layers Of Pre-trained Transformer Models Hassan Sajjad, Fahim Dalvi, Nadir Durrani, Preslav Nakov
- Lightseq: A High Performance Inference Library For Transformers Xiaohui Wang, Ying Xiong, Yang Wei, Mingxuan Wang, Lei Li
- HAT: Hardware-aware Transformers For Efficient Natural Language Processing Hanrui Wang et al.
- BERT Based Multilingual Machine Comprehension In English And Hindi Somil Gupta, Nilesh Khade
- Minilm: Deep Self-attention Distillation For Task-agnostic Compression Of Pre-trained Transformers Wenhui Wang et al.
- Can You Put It All Together: Evaluating Conversational Agents' Ability To Blend Skills Eric Michael Smith, Mary Williamson, Kurt Shuster, Jason Weston, Y-lan Boureau
- Controlling Style In Generated Dialogue Eric Michael Smith, Diana Gonzalez-rico, Emily Dinan, Y-lan Boureau
- X-LXMERT: Paint, Caption And Answer Questions With Multi-modal Transformers Jaemin Cho, Jiasen Lu, Dustin Schwenk, Hannaneh Hajishirzi, Aniruddha Kembhavi
- Delight: Deep And Light-weight Transformer Sachin Mehta, Marjan Ghazvininejad, Srinivasan Iyer, Luke Zettlemoyer, Hannaneh Hajishirzi
- Trojaning Language Models For Fun And Profit Xinyang Zhang, Zheng Zhang, Shouling Ji, Ting Wang
- ELECTRA: Pre-training Text Encoders As Discriminators Rather Than Generators Kevin Clark, Minh-thang Luong, Quoc V. Le, Christopher D. Manning
- On Optimal Transformer Depth For Low-resource Language Translation Elan Van Biljon, Arnu Pretorius, Julia Kreutzer
- Adversarial Training For Large Neural Language Models Xiaodong Liu et al.
- Unsupervised Evaluation Of Interactive Dialog With Dialogpt Shikib Mehri, Maxine Eskenazi
- An Empirical Study On Robustness To Spurious Correlations Using Pre-trained Language Models Lifu Tu, Garima Lalwani, Spandana Gella, He He
- Logical Natural Language Generation From Open-domain Tables Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang
- Mt5: A Massively Multilingual Pre-trained Text-to-text Transformer Linting Xue et al.
- Data Augmentation Using Pre-trained Transformer Models Varun Kumar, Ashutosh Choudhary, Eunah Cho
- The Pile: An 800GB Dataset Of Diverse Text For Language Modeling Leo Gao et al.
- BERT Loses Patience: Fast And Robust Inference With Early Exit Wangchunshu Zhou et al.
- Rethinking The Value Of Transformer Components Wenxuan Wang, Zhaopeng Tu
- As Good As New. How To Successfully Recycle English GPT-2 To Make Models For Other Languages Wietse De Vries, Malvina Nissim
- Funnel-transformer: Filtering Out Sequential Redundancy For Efficient Language Processing Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le
- Charbert: Character-aware Pre-trained Language Model Wentao Ma et al.
- Rethinking Embedding Coupling In Pre-trained Language Models Hyung Won Chung, Thibault Févry, Henry Tsai, Melvin Johnson, Sebastian Ruder
- Longformer: The Long-document Transformer Iz Beltagy, Matthew E. Peters, Arman Cohan
- Retrofitting Structure-aware Transformer Language Model For End Tasks Hao Fei, Yafeng Ren, Donghong Ji
- LRC-BERT: Latent-representation Contrastive Knowledge Distillation For Natural Language Understanding Hao Fu et al.
- Minilmv2: Multi-head Self-attention Relation Distillation For Compressing Pretrained Transformers Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei
- An Exploratory Study On Long Dialogue Summarization: What Works And What's Next Yusen Zhang et al.
- Few-shot Learning With Multilingual Language Models Xi Victoria Lin et al.
- Lightner: A Lightweight Tuning Paradigm For Low-resource NER Via Pluggable Prompting Xiang Chen et al.
- Bias Out-of-the-box: An Empirical Analysis Of Intersectional Occupational Biases In Popular Generative Language Models Hannah Kirk et al.
- Vlmo: Unified Vision-language Pre-training With Mixture-of-modality-experts Hangbo Bao et al.
- Truthfulqa: Measuring How Models Mimic Human Falsehoods Stephanie Lin, Jacob Hilton, Owain Evans
- The NLP Cookbook: Modern Recipes For Transformer Based Deep Learning Architectures Sushant Singh, Ausif Mahmood
- Ernie-vilg: Unified Generative Pre-training For Bidirectional Vision-language Generation Han Zhang et al.
- E2E-VLP: End-to-end Vision-language Pre-training Enhanced By Visual Learning Haiyang Xu et al.
- Evaluating The Robustness Of Retrieval Pipelines With Query Variation Generators Gustavo Penha, Arthur Câmara, Claudia Hauff
- Prefix-tuning: Optimizing Continuous Prompts For Generation Xiang Lisa Li, Percy Liang
- GPT Understands, Too Xiao Liu et al.
- Language Model As An Annotator: Exploring Dialogpt For Dialogue Summarization Xiachong Feng, Xiaocheng Feng, Libo Qin, Bing Qin, Ting Liu
- Thinking Aloud: Dynamic Context Generation Improves Zero-shot Reasoning Performance Of GPT-2 Gregor Betz, Kyle Richardson, Christian Voigt
- Contrastive Learning For Many-to-many Multilingual Neural Machine Translation Xiao Pan, Mingxuan Wang, Liwei Wu, Lei Li
- G-transformer For Document-level Machine Translation Guangsheng Bao, Yue Zhang, Zhiyang Teng, Boxing Chen, Weihua Luo
- Unifying Multimodal Transformer For Bi-directional Image And Text Generation Yupan Huang, Hongwei Xue, Bei Liu, Yutong Lu
- Language Models Are Few-shot Multilingual Learners Genta Indra Winata et al.
- Improved Text Classification Via Contrastive Adversarial Training Lin Pan, Chung-wei Hang, Avirup Sil, Saloni Potdar
- Improving Stack Overflow Question Title Generation With Copying Enhanced Codebert Model And Bi-modal Information Fengji Zhang et al.
- Progressive Transformer-based Generation Of Radiology Reports Farhad Nooralahzadeh, Nicolas Perez Gonzalez, Thomas Frauenfelder, Koji Fujimoto, Michael Krauthammer
- Reframing Instructional Prompts To Gptk's Language Swaroop Mishra, Daniel Khashabi, Chitta Baral, Yejin Choi, Hannaneh Hajishirzi
- Mention Memory: Incorporating Textual Knowledge Into Transformers Through Entity Mention Attention Michiel De Jong, Yury Zemlyanskiy, Nicholas Fitzgerald, Fei Sha, William Cohen
- Robeczech: Czech Roberta, A Monolingual Contextualized Language Representation Model Milan Straka, Jakub Náplava, Jana Straková, David Samuel
- Text-free Prosody-aware Generative Spoken Language Modeling Eugene Kharitonov et al.
- Evaluating The Robustness Of Neural Language Models To Input Perturbations Milad Moradi, Matthias Samwald
- Advancing High-resolution Video-language Representation With Large-scale Video Transcriptions Hongwei Xue et al.
- Emotion-aware Chat Machine: Automatic Emotional Response Generation For Human-like Emotional Interaction Wei Wei et al.
- Wangchanberta: Pretraining Transformer-based Thai Language Models Lalita Lowphansirikul, Charin Polpanumas, Nawat Jantrakulchai, Sarana Nutanong
- Generic Attention-model Explainability For Interpreting Bi-modal And Encoder-decoder Transformers Hila Chefer, Shir Gur, Lior Wolf
- Explaining Documents' Relevance To Search Queries Razieh Rahimi, Youngwoo Kim, Hamed Zamani, James Allan
- MWP-BERT: Numeracy-augmented Pre-training For Math Word Problem Solving Zhenwen Liang et al.
- Transformer-based Conditional Variational Autoencoder For Controllable Story Generation Le Fang et al.
- Bob: BERT Over BERT For Training Persona-based Dialogue Models From Limited Personalized Data Haoyu Song, Yan Wang, Kaiyan Zhang, Wei-nan Zhang, Ting Liu
- Learning Rich Representation Of Keyphrases From Text Mayank Kulkarni, Debanjan Mahata, Ravneet Arora, Rajarshi Bhowmik
- Personalized Transformer For Explainable Recommendation Lei Li, Yongfeng Zhang, Li Chen
- BERT, Mbert, Or Bibert? A Study On Contextualized Embeddings For Neural Machine Translation Haoran Xu, Benjamin Van Durme, Kenton Murray
- Retrieval Augmentation Reduces Hallucination In Conversation Kurt Shuster, Spencer Poff, Moya Chen, Douwe Kiela, Jason Weston
- FILIP: Fine-grained Interactive Language-image Pre-training Lewei Yao et al.
- Prompt Programming For Large Language Models: Beyond The Few-shot Paradigm Laria Reynolds, Kyle Mcdonell
- Less Is More: Pre-train A Strong Text Encoder For Dense Retrieval Using A Weak Decoder Shuqi Lu et al.
- Fine-tuning Large Neural Language Models For Biomedical Natural Language Processing Robert Tinn et al.
- Focused Attention Improves Document-grounded Generation Shrimai Prabhumoye, Kazuma Hashimoto, Yingbo Zhou, Alan W Black, Ruslan Salakhutdinov
- An Explanation Of In-context Learning As Implicit Bayesian Inference Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma
- Thank You BART! Rewarding Pre-trained Models Improves Formality Style Transfer Huiyuan Lai, Antonio Toral, Malvina Nissim
- Unsupervised Corpus Aware Language Model Pre-training For Dense Passage Retrieval Luyu Gao, Jamie Callan
- ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Shuohuan Wang et al.
- Entailment As Few-shot Learner Sinong Wang, Han Fang, Madian Khabsa, Hanzi Mao, Hao Ma
- Fewshotqa: A Simple Framework For Few-shot Learning Of Question Answering Tasks Using Pre-trained Text-to-text Models Rakesh Chada, Pradeep Natarajan
- Mitigating Political Bias In Language Models Through Reinforced Calibration Ruibo Liu et al.
- Process For Adapting Language Models To Society (PALMS) With Values-targeted Datasets Irene Solaiman, Christy Dennison
- Training Large-scale News Recommenders With Pretrained Language Models In The Loop Shitao Xiao, Zheng Liu, Yingxia Shao, Tao Di, Xing Xie
- Vision-and-language Or Vision-for-language? On Cross-modal Influence In Multimodal Transformers Stella Frank, Emanuele Bugliarello, Desmond Elliott
- Revisiting The Primacy Of English In Zero-shot Cross-lingual Transfer Iulia Turc, Kenton Lee, Jacob Eisenstein, Ming-wei Chang, Kristina Toutanova
- Swinbert: End-to-end Transformers With Sparse Attention For Video Captioning Kevin Lin et al.
- KAT: A Knowledge Augmented Transformer For Vision-and-language Liangke Gui et al.
- Pretrained Transformers As Universal Computation Engines Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch
- One Chatbot Per Person: Creating Personalized Chatbots Based On Implicit User Profiles Zhengyi Ma, Zhicheng Dou, Yutao Zhu, Hanxun Zhong, Ji-rong Wen
- Fast Model Editing At Scale Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning
- Using Prior Knowledge To Guide Bert's Attention In Semantic Textual Matching Tasks Tingyu Xia, Yue Wang, Yuan Tian, Yi Chang
- GPT-3 Models Are Poor Few-shot Learners In The Biomedical Domain Milad Moradi, Kathrin Blagec, Florian Haberl, Matthias Samwald
- Conversational Question Answering Over Knowledge Graphs With Transformer And Graph Attention Networks Endri Kacupaj et al.
- Revealing Persona Biases In Dialogue Systems Emily Sheng, Josh Arnold, Zhou Yu, Kai-wei Chang, Nanyun Peng
- Societal Biases In Language Generation: Progress And Challenges Emily Sheng, Kai-wei Chang, Premkumar Natarajan, Nanyun Peng
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- All That's 'human' Is Not Gold: Evaluating Human Evaluation Of Generated Text Elizabeth Clark et al.
- How Should Pre-trained Language Models Be Fine-tuned Towards Adversarial Robustness? Xinshuai Dong, Luu Anh Tuan, Min Lin, Shuicheng Yan, Hanwang Zhang
- Bitfit: Simple Parameter-efficient Fine-tuning For Transformer-based Masked Language-models Elad Ben Zaken, Shauli Ravfogel, Yoav Goldberg
- Arat5: Text-to-text Transformers For Arabic Language Generation El Moatez Billah Nagoudi, Abdelrahim Elmadany, Muhammad Abdul-mageed
- Investigating The Limitations Of Transformers With Simple Arithmetic Tasks Rodrigo Nogueira, Zhiying Jiang, Jimmy Lin
- Lora: Low-rank Adaptation Of Large Language Models Edward J. Hu et al.
- Align And Prompt: Video-and-language Pre-training With Entity Prompts Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C. H. Hoi
- Sequence Length Is A Domain: Length-based Overfitting In Transformer Models Dušan Variš, Ondřej Bojar
- Text Compression-aided Transformer Encoding Zuchao Li et al.
- Trankit: A Light-weight Transformer-based Toolkit For Multilingual Natural Language Processing Minh Van Nguyen, Viet Dac Lai, Amir Pouran Ben Veyseh, Thien Huu Nguyen
- Clip4caption: CLIP For Video Caption Mingkang Tang et al.
- Generate, Annotate, And Learn: NLP With Synthetic Text Xuanli He, Islam Nassar, Jamie Kiros, Gholamreza Haffari, Mohammad Norouzi
- Urltran: Improving Phishing URL Detection Using Transformers Pranav Maneriker et al.
- Causal Attention For Vision-language Tasks Xu Yang, Hanwang Zhang, Guojun Qi, Jianfei Cai
- Self-guided Contrastive Learning For BERT Sentence Representations Taeuk Kim, Kang Min Yoo, Sang-goo Lee
- Dialoglm: Pre-trained Model For Long Dialogue Understanding And Summarization Ming Zhong, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng
- TR-BERT: Dynamic Token Reduction For Accelerating BERT Inference Deming Ye, Yankai Lin, Yufei Huang, Maosong Sun
- Compressing Visual-linguistic Model Via Knowledge Distillation Zhiyuan Fang et al.
- Hierarchical Learning For Generation With Long Source Sequences Tobias Rohde, Xiaoxia Wu, Yinhan Liu
- Primer: Searching For Efficient Transformers For Language Modeling David R. So et al.
- Greedy-layer Pruning: Speeding Up Transformer Models For Natural Language Processing David Peer, Sebastian Stabinger, Stefan Engl, Antonio Rodriguez-sanchez
- Cross-attention Is All You Need: Adapting Pretrained Transformers For Machine Translation Mozhdeh Gheini, Xiang Ren, Jonathan May
- Luna: Linear Unified Nested Attention Xuezhe Ma et al.
- The Impact Of Multiple Parallel Phrase Suggestions On Email Input And Composition Behaviour Of Native And Non-native English Writers Daniel Buschek, Martin Zürn, Malin Eiband
- Adaptive Semiparametric Language Models Dani Yogatama, Cyprien De Masson D'autume, Lingpeng Kong
- Calibrate Before Use: Improving Few-shot Performance Of Language Models Tony Z. Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh
- A Plug-and-play Method For Controlled Text Generation Damian Pascual, Beni Egressy, Clara Meister, Ryan Cotterell, Roger Wattenhofer
- Knowledge Neurons In Pretrained Transformers Damai Dai et al.
- Differentially Private Fine-tuning Of Language Models Da Yu et al.
- Automated Quality Assessment Of Cognitive Behavioral Therapy Sessions Through Highly Contextualized Language Representations Nikolaos Flemotomos et al.
- Long-span Summarization Via Local Attention And Content Selection Potsawee Manakul, Mark J. F. Gales
- The Stability-efficiency Dilemma: Investigating Sequence Length Warmup For Training GPT Models Conglong Li, Minjia Zhang, Yuxiong He
- Language Model Evaluation Beyond Perplexity Clara Meister, Ryan Cotterell
- Codet5: Identifier-aware Unified Pre-trained Encoder-decoder Models For Code Understanding And Generation Yue Wang, Weishi Wang, Shafiq Joty, Steven C. H. Hoi
- One Teacher Is Enough? Pre-trained Language Model Distillation From Multiple Teachers Chuhan Wu, Fangzhao Wu, Yongfeng Huang
- Newsbert: Distilling Pre-trained Language Model For Intelligent News Application Chuhan Wu et al.
- Multimodal Transformer With Variable-length Memory For Vision-and-language Navigation Chuang Lin et al.
- Fastformer: Additive Attention Can Be All You Need Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang, Xing Xie
- Supervision Exists Everywhere: A Data Efficient Contrastive Language-image Pre-training Paradigm Yangguang Li et al.
- Dialogue State Tracking With A Language Model Using Schema-driven Prompting Chia-hsuan Lee, Hao Cheng, Mari Ostendorf
- Glam: Efficient Scaling Of Language Models With Mixture-of-experts Nan Du et al.
- A Token-level Reference-free Hallucination Detection Benchmark For Free-form Text Generation Tianyu Liu et al.
- Larger-scale Transformers For Multilingual Masked Language Modeling Naman Goyal, Jingfei Du, Myle Ott, Giri Anantharaman, Alexis Conneau
- Prompting Visual-language Models For Efficient Video Understanding Chen Ju, Tengda Han, Kunhao Zheng, Ya Zhang, Weidi Xie
- Terapipe: Token-level Pipeline Parallelism For Training Large-scale Language Models Zhuohan Li et al.
- NÜWA: Visual Synthesis Pre-training For Neural Visual World Creation Chenfei Wu et al.
- RAFT: A Real-world Few-shot Text Classification Benchmark Neel Alex et al.
- Non-invasive Self-attention For Side Information Fusion In Sequential Recommendation Chang Liu et al.
- Fantastically Ordered Prompts And Where To Find Them: Overcoming Few-shot Prompt Order Sensitivity Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, Pontus Stenetorp
- Is GPT-3 Text Indistinguishable From Human Text? Scarecrow: A Framework For Scrutinizing Machine Text Yao Dou, Maxwell Forbes, Rik Koncel-kedziorski, Noah A. Smith, Yejin Choi
- Climatebert: A Pretrained Language Model For Climate-related Text Nicolas Webersinke, Mathias Kraus, Julia Anna Bingler, Markus Leippold
- The Power Of Scale For Parameter-efficient Prompt Tuning Brian Lester, Rami Al-rfou, Noah Constant
- See, Hear, Read: Leveraging Multimodality With Guided Attention For Abstractive Text Summarization Yash Kumar Atri, Shraman Pramanick, Vikram Goyal, Tanmoy Chakraborty
- A Short Survey Of Pre-trained Language Models For Conversational AI-A Newage In NLP Munazza Zaib, Quan Z. Sheng, Wei Emma Zhang
- Symbolic Knowledge Distillation: From General Language Models To Commonsense Models Peter West et al.
- Recent Advances In Natural Language Processing Via Large Pre-trained Language Models: A Survey Bonan Min et al.
- Vision Guided Generative Pre-trained Language Models For Multimodal Abstractive Summarization Tiezheng Yu, Wenliang Dai, Zihan Liu, Pascale Fung
- What Changes Can Large-scale Language Models Bring? Intensive Study On Hyperclova: Billions-scale Korean Generative Pretrained Transformers Boseop Kim et al.
- MMBERT: Multimodal BERT Pretraining For Improved Medical VQA Yash Khare et al.
- Understanding And Overcoming The Challenges Of Efficient Transformer Quantization Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort
- Scheduled Sampling In Vision-language Pretraining With Decoupled Encoder-decoder Network Yehao Li, Yingwei Pan, Ting Yao, Jingwen Chen, Tao Mei
- Prompt-learning For Fine-grained Entity Typing Ning Ding et al.
- NSP-BERT: A Prompt-based Few-shot Learner Through An Original Pre-training Task--next Sentence Prediction Yi Sun, Yu Zheng, Chao Hao, Hangping Qiu
- SGEITL: Scene Graph Enhanced Image-text Learning For Visual Commonsense Reasoning Zhecan Wang et al.
- DYLE: Dynamic Latent Extraction For Abstractive Long-input Summarization Ziming Mao et al.
- Scale Efficiently: Insights From Pre-training And Fine-tuning Transformers Yi Tay et al.
- Medically Aware GPT-3 As A Data Generator For Medical Dialogue Summarization Bharath Chintagunta, Namit Katariya, Xavier Amatriain, Anitha Kannan
- Are Pre-trained Convolutions Better Than Pre-trained Transformers? Yi Tay et al.
- Towards Few-shot Fact-checking Via Perplexity Nayeon Lee, Yejin Bang, Andrea Madotto, Madian Khabsa, Pascale Fung
- Maria: Spanish Language Models Asier Gutiérrez-fandiño et al.
- CDLM: Cross-document Language Modeling Avi Caciularu et al.
- Debertav3: Improving Deberta Using Electra-style Pre-training With Gradient-disentangled Embedding Sharing Pengcheng He, Jianfeng Gao, Weizhu Chen
- Towards Facilitating Empathic Conversations In Online Mental Health Support: A Reinforcement Learning Approach Ashish Sharma, Inna W. Lin, Adam S. Miner, David C. Atkins, Tim Althoff
- Multilingual LAMA: Investigating Knowledge In Multilingual Pretrained Language Models Nora Kassner, Philipp Dufter, Hinrich Schütze
- Multilingual Language Models Predict Human Reading Behavior Nora Hollenstein, Federico Pirovano, Ce Zhang, Lena Jäger, Lisa Beinborn
- Hierarchical Task Learning From Language Instructions With Unified Transformers And Self-monitoring Yichi Zhang, Joyce Chai
- Muppet: Massive Multi-task Representations With Pre-finetuning Armen Aghajanyan et al.
- Prune Once For All: Sparse Pre-trained Language Models Ofir Zafrir, Ariel Larey, Guy Boudoukh, Haihao Shen, Moshe Wasserblat
- Human Parity On Commonsenseqa: Augmenting Self-attention With External Attention Yichong Xu et al.
- Sustainable Modular Debiasing Of Language Models Anne Lauscher, Tobias Lüken, Goran Glavaš
- Predicting The Performance Of Multilingual NLP Models Anirudh Srinivasan et al.
- What Do Pre-trained Code Models Know About Code? Anjan Karmakar, Romain Robbes
- ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Yu Sun et al.
- An Empirical Study Of Training End-to-end Vision-and-language Transformers Zi-yi Dou et al.
- Wordcraft: A Human-ai Collaborative Editor For Story Writing Andy Coenen, Luke Davis, Daphne Ippolito, Emily Reif, Ann Yuan
- Mind The Gap: Assessing Temporal Generalization In Neural Language Models Angeliki Lazaridou et al.
- GLM: General Language Model Pretraining With Autoregressive Blank Infilling Zhengxiao Du et al.
- When Attention Meets Fast Recurrence: Training Language Models With Reduced Compute Tao Lei
- Demix Layers: Disentangling Domains For Modular Language Modeling Suchin Gururangan, Mike Lewis, Ari Holtzman, Noah A. Smith, Luke Zettlemoyer
- Worst Of Both Worlds: Biases Compound In Pre-trained Vision-and-language Models Tejas Srinivasan, Yonatan Bisk
- Few-shot Bot: Prompt-based Learning For Dialogue Systems Andrea Madotto, Zhaojiang Lin, Genta Indra Winata, Pascale Fung
- General-purpose Question-answering With Macaw Oyvind Tafjord, Peter Clark
- An Empirical Study Of GPT-3 For Few-shot Knowledge-based VQA Zhengyuan Yang et al.
- Task-oriented Dialogue System As Natural Language Generation Weizhi Wang et al.
- Episodic Transformer For Vision-and-language Navigation Alexander Pashevich, Cordelia Schmid, Chen Sun
- Understanding The Capabilities, Limitations, And Societal Impact Of Large Language Models Alex Tamkin, Miles Brundage, Jack Clark, Deep Ganguli
- Dexperts: Decoding-time Controlled Text Generation With Experts And Anti-experts Alisa Liu et al.
- Tacl: Improving BERT Pre-training With Token-aware Contrastive Learning Yixuan Su et al.
- KM-BART: Knowledge Enhanced Multimodal BART For Visual Commonsense Generation Yiran Xing et al.
- On Explaining Your Explanations Of BERT: An Empirical Study With Sequence Classification Zhengxuan Wu, Desmond C. Ong
- Large Pre-trained Language Models Contain Human-like Biases Of What Is Right And Wrong To Do Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin A. Rothkopf, Kristian Kersting
- MT6: Multilingual Pretrained Text-to-text Transformer With Translation Pairs Zewen Chi et al.
- Towards Retrieval-based Conversational Recommendation Ahtsham Manzoor, Dietmar Jannach
- Bertese: Learning To Speak To BERT Adi Haviv, Jonathan Berant, Amir Globerson
- Commitbert: Commit Message Generation Using Pre-trained Programming Language Model Tae-hwan Jung
- Quiz-style Question Generation For News Stories Adam D. Lelkes, Vinh Q. Tran, Cong Yu
- TURINGBENCH: A Benchmark Environment For Turing Test In The Age Of Neural Text Generation Adaku Uchendu, Zeyu Ma, Thai Le, Rui Zhang, Dongwon Lee
- Scalable And Efficient Moe Training For Multitask Multilingual Models Young Jin Kim et al.
- Embodied BERT: A Transformer Model For Embodied, Language-guided Visual Task Completion Alessandro Suglia, Qiaozi Gao, Jesse Thomason, Govind Thattai, Gaurav Sukhatme
- Visqa: X-raying Vision And Language Reasoning In Transformers Theo Jaunet et al.
- Distilling Large Language Models Into Tiny And Effective Students Using Pqrnn Prabhu Kaliamoorthi, Aditya Siddhant, Edward Li, Melvin Johnson
- CANINE: Pre-training An Efficient Tokenization-free Encoder For Language Representation Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting
- Unlocking Compositional Generalization In Pre-trained Models Using Intermediate Representations Jonathan Herzig et al.
- Cotext: Multi-task Learning With Code-text Transformer Long Phan et al.
- Open Domain Question Answering Over Tables Via Dense Retrieval Jonathan Herzig, Thomas Müller, Syrine Krichene, Julian Martin Eisenschlos
- Rome Was Built In 1776: A Case Study On Factual Correctness In Knowledge-grounded Response Generation Sashank Santhanam et al.
- Scifive: A Text-to-text Transformer Model For Biomedical Literature Long N. Phan et al.
- Beyond Goldfish Memory: Long-term Open-domain Conversation Jing Xu, Arthur Szlam, Jason Weston
- Exploring Transformers In Natural Language Generation: GPT, BERT, And Xlnet M. Onat Topal, Anil Bas, Imke Van Heerden
- Using Adversarial Attacks To Reveal The Statistical Bias In Machine Reading Comprehension Models Jieyu Lin, Jiajie Zou, Nai Ding
- Learned Token Pruning For Transformers Sehoon Kim et al.
- Multi-modal Understanding And Generation For Medical Images And Text Via Vision-language Pre-training Jong Hak Moon, Hyungyung Lee, Woncheol Shin, Young-hak Kim, Edward Choi
- Rethink Training Of BERT Rerankers In Multi-stage Retrieval Pipeline Luyu Gao, Zhuyun Dai, Jamie Callan
- I-BERT: Integer-only BERT Quantization Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer
- Sentence-t5: Scalable Sentence Encoders From Pre-trained Text-to-text Models Jianmo Ni et al.
- E-vil: A Dataset And Benchmark For Natural Language Explanations In Vision-language Tasks Maxime Kayser et al.
- Taming Sparsely Activated Transformer With Stochastic Experts Simiao Zuo et al.
- UFO: A Unified Transformer For Vision-language Representation Learning Jianfeng Wang et al.
- Condenser: A Pre-training Architecture For Dense Retrieval Luyu Gao, Jamie Callan
- Fastmoe: A Fast Mixture-of-expert Training System Jiaao He et al.
- Hiddencut: Simple Data Augmentation For Natural Language Understanding With Better Generalization Jiaao Chen, Dinghan Shen, Weizhu Chen, Diyi Yang
- Planning With Learned Entity Prompts For Abstractive Summarization Shashi Narayan et al.
- Codexglue: A Machine Learning Benchmark Dataset For Code Understanding And Generation Shuai Lu et al.
- FLAT: An Optimized Dataflow For Mitigating Attention Bottlenecks Sheng-chun Kao, Suvinay Subramanian, Gaurav Agrawal, Amir Yazdanbakhsh, Tushar Krishna
- Pangu-α: Large-scale Autoregressive Pretrained Chinese Language Models With Auto-parallel Computation Wei Zeng et al.
- What Makes Good In-context Examples For GPT-3? Jiachang Liu et al.
- Lightningdot: Pre-training Visual-semantic Embeddings For Real-time Image-text Retrieval Siqi Sun et al.
- Code Structure Guided Transformer For Source Code Summarization Shuzheng Gao et al.
- Recursively Summarizing Books With Human Feedback Jeff Wu et al.
- Evaluating Large Language Models Trained On Code Mark Chen et al.
- Finetuned Language Models Are Zero-shot Learners Jason Wei et al.
- Show Your Work: Scratchpads For Intermediate Computation With Language Models Maxwell Nye et al.
- Variational Information Bottleneck For Effective Low-resource Fine-tuning Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- Unifying Vision-and-language Tasks Via Text Generation Jaemin Cho, Jie Lei, Hao Tan, Mohit Bansal
- Scaling Language Models: Methods, Analysis & Insights From Training Gopher Jack W. Rae et al.
- Diagnosing Vision-and-language Navigation: What Really Matters Wanrong Zhu et al.
- Longt5: Efficient Text-to-text Transformer For Long Sequences Mandy Guo et al.
- How Much Do Language Models Copy From Their Training Data? Evaluating Linguistic Novelty In Text Generation Using RAVEN R. Thomas Mccoy, Paul Smolensky, Tal Linzen, Jianfeng Gao, Asli Celikyilmaz
- Webgpt: Browser-assisted Question-answering With Human Feedback Reiichiro Nakano et al.
- Redditbias: A Real-world Resource For Bias Evaluation And Debiasing Of Conversational Language Models Soumya Barikeri, Anne Lauscher, Ivan Vulić, Goran Glavaš
- Gpt3mix: Leveraging Large-scale Language Models For Text Augmentation Kang Min Yoo, Dongju Park, Jaewook Kang, Sang-woo Lee, Woomyeong Park
- Hurdles To Progress In Long-form Question Answering Kalpesh Krishna, Aurko Roy, Mohit Iyyer
- WARP: Word-level Adversarial Reprogramming Karen Hambardzumyan, Hrant Khachatrian, Jonathan May
- Adapting Language Models For Zero-shot Learning By Meta-tuning On Dataset And Prompt Collections Ruiqi Zhong, Kristy Lee, Zheng Zhang, Dan Klein
- Augmenting Sequential Recommendation With Pseudo-prior Items Via Reversely Pre-training Transformer Zhiwei Liu, Ziwei Fan, Yu Wang, Philip S. Yu
- Pretrained Language Models For Text Generation: A Survey Junyi Li, Tianyi Tang, Wayne Xin Zhao, Ji-rong Wen
- Dialogue History Matters! Personalized Response Selectionin Multi-turn Retrieval-based Chatbots Juntao Li et al.
- Improving Language Models By Retrieving From Trillions Of Tokens Sebastian Borgeaud et al.
- Indonlg: Benchmark And Resources For Evaluating Indonesian Natural Language Generation Samuel Cahyawijaya et al.
- Byt5: Towards A Token-free Future With Pre-trained Byte-to-byte Models Linting Xue et al.
- Reframing Human-ai Collaboration For Generating Free-text Explanations Sarah Wiegreffe, Jack Hessel, Swabha Swayamdipta, Mark Riedl, Yejin Choi
- Visualgpt: Data-efficient Adaptation Of Pretrained Language Models For Image Captioning Jun Chen, Han Guo, Kai Yi, Boyang Li, Mohamed Elhoseiny
- MATE: Multi-view Attention For Table Transformer Efficiency Julian Martin Eisenschlos, Maharshi Gor, Thomas Müller, William W. Cohen
- Robertuito: A Pre-trained Language Model For Social Media Text In Spanish Juan Manuel Pérez, Damián A. Furman, Laura Alonso Alemany, Franco Luque
- Normformer: Improved Transformer Pretraining With Extra Normalization Sam Shleifer, Jason Weston, Myle Ott
- AMMUS: A Survey Of Transformer-based Pretrained Models In Natural Language Processing Katikapalli Subramanyam Kalyan, Ajit Rajasekharan, Sivanesan Sangeetha
- A Comparative Study Of Transformer-based Language Models On Extractive Question Answering Kate Pearce, Tiffany Zhan, Aneesh Komanduri, Justin Zhan
- Training Verifiers To Solve Math Word Problems Karl Cobbe et al.
- Retromae: Pre-training Retrieval-oriented Language Models Via Masked Auto-encoder Shitao Xiao, Zheng Liu, Yingxia Shao, Zhao Cao
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- Evaluating Mixed-initiative Conversational Search Systems Via User Simulation Ivan Sekulić, Mohammad Aliannejadi, Fabio Crestani
- Explanations From Large Language Models Make Small Reasoners Better Shiyang Li et al.
- Revisiting Neural Scaling Laws In Language And Vision Ibrahim Alabdulmohsin, Behnam Neyshabur, Xiaohua Zhai
- Reacc: A Retrieval-augmented Code Completion Framework Shuai Lu et al.
- Coderl: Mastering Code Generation Through Pretrained Models And Deep Reinforcement Learning Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Silvio Savarese, Steven C. H. Hoi
- Recommendation As Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5) Shijie Geng, Shuchang Liu, Zuohui Fu, Yingqiang Ge, Yongfeng Zhang
- Vision-and-language Pretrained Models: A Survey Siqu Long, Feiqi Cao, Soyeon Caren Han, Haiqin Yang
- Rethinking The Role Of Demonstrations: What Makes In-context Learning Work? Sewon Min et al.
- A Survey On Retrieval-augmented Text Generation Huayang Li, Yixuan Su, Deng Cai, Yan Wang, Lemao Liu
- On The Effect Of Pretraining Corpora On In-context Learning By A Large-scale Language Model Seongjin Shin et al.
- CLIP-TD: CLIP Targeted Distillation For Vision-language Tasks Zhecan Wang et al.
- Demystifying Prompts In Language Models Via Perplexity Estimation Hila Gonen, Srini Iyer, Terra Blevins, Noah A. Smith, Luke Zettlemoyer
- On The Paradox Of Learning To Reason From Data Honghua Zhang, Liunian Harold Li, Tao Meng, Kai-wei Chang, Guy Van Den Broeck
- Interleaving Retrieval With Chain-of-thought Reasoning For Knowledge-intensive Multi-step Questions Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal
- Enabling Multimodal Generation On CLIP Via Vision-language Knowledge Distillation Wenliang Dai et al.
- Murag: Multimodal Retrieval-augmented Generator For Open Question Answering Over Images And Text Wenhu Chen, Hexiang Hu, Xi Chen, Pat Verga, William W. Cohen
- EVA2.0: Investigating Open-domain Chinese Dialogue Systems With Large-scale Pre-training Yuxian Gu et al.
- Cogvideo: Large-scale Pretraining For Text-to-video Generation Via Transformers Wenyi Hong, Ming Ding, Wendi Zheng, Xinghan Liu, Jie Tang
- An Efficient Memory-augmented Transformer For Knowledge-intensive NLP Tasks Yuxiang Wu et al.
- A Length-extrapolatable Transformer Yutao Sun et al.
- Contrastive Learning With Bidirectional Transformers For Sequential Recommendation Hanwen Du et al.
- A Survey Of Controllable Text Generation Using Transformer-based Pre-trained Language Models Hanqing Zhang, Haolin Song, Shaoyu Li, Ming Zhou, Dawei Song
- Vl-beit: Generative Vision-language Pretraining Hangbo Bao, Wenhui Wang, Li Dong, Furu Wei
- Rethinking With Retrieval: Faithful Large Language Model Inference Hangfeng He, Hongming Zhang, Dan Roth
- Less Is More: Learning To Refine Dialogue History For Personalized Dialogue Generation Hanxun Zhong, Zhicheng Dou, Yutao Zhu, Hongjin Qian, Ji-rong Wen
- Pali: A Jointly-scaled Multilingual Language-image Model Xi Chen et al.
- The Unreliability Of Explanations In Few-shot Prompting For Textual Reasoning Xi Ye, Greg Durrett
- Hybrid Transformer With Multi-level Fusion For Multimodal Knowledge Graph Completion Xiang Chen et al.
- Layoutlmv3: Pre-training For Document AI With Unified Text And Image Masking Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei
- LUT-GEMM: Quantized Matrix Multiplication Based On Luts For Efficient Inference In Large-scale Generative Language Models Gunho Park et al.
- Using Large Language Models To Simulate Multiple Humans And Replicate Human Subject Studies Gati Aher, Rosa I. Arriaga, Adam Tauman Kalai
- Data Augmentation For Intent Classification With Off-the-shelf Large Language Models Gaurav Sahu et al.
- Promptcap: Prompt-guided Task-aware Image Captioning Yushi Hu et al.
- Contrastive Decoding: Open-ended Text Generation As Optimization Xiang Lisa Li et al.
- Synchromesh: Reliable Code Generation From Pre-trained Language Models Gabriel Poesia et al.
- Moral Mimicry: Large Language Models Produce Moral Rationalizations Tailored To Political Identity Gabriel Simmons
- Hitskt: A Hierarchical Transformer Model For Session-aware Knowledge Tracing Fucai Ke et al.
- Ignore Previous Prompt: Attack Techniques For Language Models Fábio Perez, Ian Ribeiro
- Mass-editing Memory In A Transformer Kevin Meng, Arnab Sen Sharma, Alex Andonian, Yonatan Belinkov, David Bau
- Zeroquant: Efficient And Affordable Post-training Quantization For Large-scale Transformers Zhewei Yao et al.
- Chatgpt Makes Medicine Easy To Swallow: An Exploratory Case Study On Simplified Radiology Reports Katharina Jeblick et al.
- Flashattention: Fast And Memory-efficient Exact Attention With Io-awareness Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré
- Minicons: Enabling Flexible Behavioral And Representational Analyses Of Transformer Language Models Kanishka Misra
- Teaching Models To Express Their Uncertainty In Words Stephanie Lin, Jacob Hilton, Owain Evans
- Alexatm 20B: Few-shot Learning Using A Large-scale Multilingual Seq2seq Model Saleh Soltan et al.
- VLC-BERT: Visual Question Answering With Contextualized Commonsense Knowledge Sahithya Ravi, Aditya Chinchure, Leonid Sigal, Renjie Liao, Vered Shwartz
- Accelerating Attention Through Gradient-based Learned Runtime Pruning Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, Hadi Esmaeilzadeh, Mingu Kang
- Deepspeed-moe: Advancing Mixture-of-experts Inference And Training To Power Next-generation AI Scale Samyam Rajbhandari et al.
- Leveraging Large Language Models For Multiple Choice Question Answering Joshua Robinson, Christopher Michael Rytting, David Wingate
- Training Compute-optimal Large Language Models Jordan Hoffmann et al.
- Do Language Models Plagiarize? Jooyoung Lee, Thai Le, Jinghui Chen, Dongwon Lee
- Cramming: Training A Language Model On A Single GPU In One Day Jonas Geiping, Tom Goldstein
- Towards Trustworthy Autograding Of Short, Multi-lingual, Multi-type Answers Johannes Schneider, Robin Richner, Micha Riser
- Can Large Language Models Truly Understand Prompts? A Case Study With Negated Prompts Joel Jang, Seonghyeon Ye, Minjoon Seo
- News Summarization And Evaluation In The Era Of GPT-3 Tanya Goyal, Junyi Jessy Li, Greg Durrett
- Do Large Language Models Know What Humans Know? Sean Trott, Cameron Jones, Tyler Chang, James Michaelov, Benjamin Bergen
- Biogpt: Generative Pre-trained Transformer For Biomedical Text Generation And Mining Renqian Luo et al.
- Controllable Natural Language Generation With Contrastive Prefixes Jing Qian, Li Dong, Yelong Shen, Furu Wei, Weizhu Chen
- Incorporating Domain Knowledge Through Task Augmentation For Front-end Javascript Code Generation Sijie Shen et al.
- RASAT: Integrating Relational Structures Into Pretrained Seq2seq Model For Text-to-sql Jiexing Qi et al.
- Unified-io: A Unified Model For Vision, Language, And Multi-modal Tasks Jiasen Lu, Christopher Clark, Rowan Zellers, Roozbeh Mottaghi, Aniruddha Kembhavi
- Lilt: A Simple Yet Effective Language-independent Layout Transformer For Structured Document Understanding Jiapeng Wang, Lianwen Jin, Kai Ding
- GIT: A Generative Image-to-text Transformer For Vision And Language Jianfeng Wang et al.
- Knowledge Prompting In Pre-trained Language Model For Natural Language Understanding Jianing Wang et al.
- Ask Me Anything: A Simple Strategy For Prompting Language Models Simran Arora et al.
- Improving The Domain Adaptation Of Retrieval Augmented Generation (RAG) Models For Open Domain Question Answering Shamane Siriwardhana et al.
- Scaling Autoregressive Models For Content-rich Text-to-image Generation Jiahui Yu et al.
- Coca: Contrastive Captioners Are Image-text Foundation Models Jiahui Yu et al.
- Zerogen: Efficient Zero-shot Learning Via Dataset Generation Jiacheng Ye et al.
- Gtrans: Grouping And Fusing Transformer Layers For Neural Machine Translation Jian Yang et al.
- Adapting Pre-trained Language Models To African Languages Via Multilingual Adaptive Fine-tuning Jesujoba O. Alabi, David Ifeoluwa Adelani, Marius Mosbach, Dietrich Klakow
- SPACE-3: Unified Dialog Model Pre-training For Task-oriented Dialog Understanding And Generation Wanwei He et al.
- BERTIN: Efficient Pre-training Of A Spanish Language Model Using Perplexity Sampling Javier De La Rosa et al.
- Chain-of-thought Prompting Elicits Reasoning In Large Language Models Jason Wei et al.
- Visconde: Multi-document QA With GPT-3 And Neural Reranking Jayr Pereira, Robson Fidalgo, Roberto Lotufo, Rodrigo Nogueira
- Dall-eval: Probing The Reasoning Skills And Social Biases Of Text-to-image Generation Models Jaemin Cho, Abhay Zala, Mohit Bansal
- Gpt-neox-20b: An Open-source Autoregressive Language Model Sid Black et al.
- Using Deepspeed And Megatron To Train Megatron-turing NLG 530B, A Large-scale Generative Language Model Shaden Smith et al.
- Action-gpt: Leveraging Large-scale Language Models For Improved And Generalized Action Generation Sai Shashank Kalakonda, Shubh Maheshwari, Ravi Kiran Sarvadevabhatla
- Neural Theory-of-mind? On The Limits Of Social Intelligence In Large Lms Maarten Sap, Ronan Lebras, Daniel Fried, Yejin Choi
- Inpars: Data Augmentation For Information Retrieval Using Large Language Models Luiz Bonifacio, Hugo Abonizio, Marzieh Fadaee, Rodrigo Nogueira
- What Language Model Architecture And Pretraining Objective Work Best For Zero-shot Generalization? Thomas Wang et al.
- Training Language Models To Follow Instructions With Human Feedback Long Ouyang et al.
- Vit5: Pretrained Text-to-text Transformer For Vietnamese Language Generation Long Phan, Hieu Tran, Hieu Nguyen, Trieu H. Trinh
- Structured Like A Language Model: Analysing AI As An Automated Subject Liam Magee, Vanicka Arora, Luke Munn
- Lamda: Language Models For Dialog Applications Romal Thoppilan et al.
- Efficient Few-shot Learning Without Prompts Lewis Tunstall et al.
- Phenaki: Variable Length Video Generation From Open Domain Textual Description Ruben Villegas et al.
- Personalized Prompt Learning For Explainable Recommendation Lei Li, Yongfeng Zhang, Li Chen
- Data Distributional Properties Drive Emergent In-context Learning In Transformers Stephanie C. Y. Chan et al.
- Galactica: A Large Language Model For Science Ross Taylor et al.
- Language Models That Seek For Knowledge: Modular Search & Generation For Dialogue And Prompt Completion Kurt Shuster et al.
- Blenderbot 3: A Deployed Conversational Agent That Continually Learns To Responsibly Engage Kurt Shuster et al.
- Distilling Reasoning Capabilities Into Smaller Language Models Kumar Shridhar, Alessandro Stolfo, Mrinmaya Sachan
- Promptagator: Few-shot Dense Retrieval From 8 Examples Zhuyun Dai et al.
- Who Is GPT-3? An Exploration Of Personality, Values And Demographics Marilù Miotto, Nicola Rossberg, Bennett Kleinberg
- Efficient Long-text Understanding With Short-text Models Maor Ivgi, Uri Shaham, Jonathan Berant
- Language Models Are Realistic Tabular Data Generators Vadim Borisov, Kathrin Seßler, Tobias Leemann, Martin Pawelczyk, Gjergji Kasneci
- Reproducible Scaling Laws For Contrastive Language-image Learning Mehdi Cherti et al.
- OPT: Open Pre-trained Transformer Language Models Susan Zhang et al.
- Can Large Language Models Reason About Medical Questions? Valentin Liévin, Christoffer Egeberg Hother, Andreas Geert Motzfeldt, Ole Winther
- Visual Prompt Tuning Menglin Jia et al.
- Structured Pruning Learns Compact And Accurate Models Mengzhou Xia, Zexuan Zhong, Danqi Chen
- Gpt-3-driven Pedagogical Agents For Training Children's Curious Question-asking Skills Rania Abdelghani et al.
- Can Machines Help Us Answering Question 16 In Datasheets, And In Turn Reflecting On Inappropriate Content? Patrick Schramowski, Christopher Tauchmann, Kristian Kersting
- A Systematic Evaluation Of Large Language Models Of Code Frank F. Xu, Uri Alon, Graham Neubig, Vincent J. Hellendoorn
- Mixgen: A New Multi-modal Data Augmentation Xiaoshuai Hao et al.
- Pangu-coder: Program Synthesis With Function-level Language Modeling Fenia Christopoulou et al.
- Vindlu: A Recipe For Effective Video-and-language Pretraining Feng Cheng et al.
- NLX-GPT: A Model For Natural Language Explanations In Vision And Vision-language Tasks Fawaz Sammani, Tanmoy Mukherjee, Nikos Deligiannis
- Ernie-search: Bridging Cross-encoder With Dual-encoder Via Self On-the-fly Distillation For Dense Passage Retrieval Yuxiang Lu et al.
- Sgva-clip: Semantic-guided Visual Adapting Of Vision-language Models For Few-shot Image Classification Fang Peng, Xiaoshan Yang, Linhui Xiao, Yaowei Wang, Changsheng Xu
- Towards Unified Conversational Recommender Systems Via Knowledge-enhanced Prompt Learning Xiaolei Wang, Kun Zhou, Ji-rong Wen, Wayne Xin Zhao
- Vl-interpret: An Interactive Visualization Tool For Interpreting Vision-language Transformers Estelle Aflalo et al.
- Compilable Neural Code Generation With Compiler Feedback Xin Wang et al.
- GPTQ: Accurate Post-training Quantization For Generative Pre-trained Transformers Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh
- The Optimal BERT Surgeon: Scalable And Accurate Second-order Pruning For Large Language Models Eldar Kurtic et al.
- CREPE: Can Vision-language Foundation Models Reason Compositionally? Zixian Ma et al.
- Bytetransformer: A High-performance Transformer Boosted For Variable-length Inputs Yujia Zhai et al.
- A Generative Language Model For Few-shot Aspect-based Sentiment Analysis Ehsan Hosseini-asl, Wenhao Liu, Caiming Xiong
- Hyperprompt: Prompt-based Task-conditioning Of Transformers Yun He et al.
- Improving Passage Retrieval With Zero-shot Question Generation Devendra Singh Sachan et al.
- Lm-nav: Robotic Navigation With Large Pre-trained Models Of Language, Vision, And Action Dhruv Shah, Blazej Osinski, Brian Ichter, Sergey Levine
- VIMA: General Robot Manipulation With Multimodal Prompts Yunfan Jiang et al.
- Binding Language Models In Symbolic Languages Zhoujun Cheng et al.
- Least-to-most Prompting Enables Complex Reasoning In Large Language Models Denny Zhou et al.
- Protoclip: Prototypical Contrastive Language Image Pretraining Delong Chen et al.
- Block-recurrent Transformers Delesley Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur
- Future Transformer For Long-term Action Anticipation Dayoung Gong, Joonseok Lee, Manjin Kim, Seong Jong Ha, Minsu Cho
- Adaprompt: Adaptive Model Training For Prompt-based NLP Yulong Chen et al.
- CERT: Continual Pre-training On Sketches For Library-oriented Code Generation Daoguang Zan et al.
- Hungry Hungry Hippos: Towards Language Modeling With State Space Models Daniel Y. Fu et al.
- Why Can GPT Learn In-context? Language Models Implicitly Perform Gradient Descent As Meta-optimizers Damai Dai et al.
- Democratizing Contrastive Language-image Pre-training: A CLIP Benchmark Of Data, Model, And Supervision Yufeng Cui, Lichen Zhao, Feng Liang, Yangguang Li, Jing Shao
- Putting Gpt-3's Creativity To The (alternative Uses) Test Claire Stevenson, Iris Smal, Matthijs Baas, Raoul Grasman, Han Van Der Maas
- Competition-level Code Generation With Alphacode Yujia Li et al.
- LAION-5B: An Open Large-scale Dataset For Training Next Generation Image-text Models Christoph Schuhmann et al.
- Fast Inference From Transformers Via Speculative Decoding Yaniv Leviathan, Matan Kalman, Yossi Matias
- Mplug: Effective And Efficient Vision-language Learning By Cross-modal Skip-connections Chenliang Li et al.
- Prompting GPT-3 To Be Reliable Chenglei Si et al.
- No More Fine-tuning? An Experimental Evaluation Of Prompt Tuning In Code Intelligence Chaozheng Wang et al.
- Long-form Video-language Pre-training With Multimodal Temporal Contrastive Learning Yuchong Sun et al.
- Complexity-based Prompting For Multi-step Reasoning Yao Fu, Hao Peng, Ashish Sabharwal, Peter Clark, Tushar Khot
- Exploring Length Generalization In Large Language Models Cem Anil et al.
- In-context Learning And Induction Heads Catherine Olsson et al.
- A Survey On Model Compression And Acceleration For Pretrained Language Models Canwen Xu, Julian Mcauley
- Adamix: Mixture-of-adaptations For Parameter-efficient Model Tuning Yaqing Wang et al.
- Language Models Are General-purpose Interfaces Yaru Hao et al.
- Why Does Surprisal From Larger Transformer-based Language Models Provide A Poorer Fit To Human Reading Times? Byung-doh Oh, William Schuler
- Exploring The Limits Of Domain-adaptive Training For Detoxifying Large-scale Language Models Boxin Wang et al.
- Is GPT-3 A Good Data Annotator? Bosheng Ding et al.
- Expanding Language-image Pretrained Models For General Video Recognition Bolin Ni et al.
- Survey Of Hallucination In Natural Language Generation Ziwei Ji et al.
- Super-naturalinstructions: Generalization Via Declarative Instructions On 1600+ NLP Tasks Yizhong Wang et al.
- Impact Of Pretraining Term Frequencies On Few-shot Reasoning Yasaman Razeghi, Robert L. Logan IV, Matt Gardner, Sameer Singh
- Language Models Can See: Plugging Visual Controls In Text Generation Yixuan Su et al.
- Analogy Generation By Prompting Large Language Models: A Case Study Of Instructgpt Bhavya Bhavya, Jinjun Xiong, Chengxiang Zhai
- BLOOM: A 176b-parameter Open-access Multilingual Language Model Bigscience Workshop et al.
- Memorizing Transformers Yuhuai Wu, Markus N. Rabe, Delesley Hutchins, Christian Szegedy
- Thinking About GPT-3 In-context Learning For Biomedical IE? Think Again Bernal Jiménez Gutiérrez et al.
- Attributed Question Answering: Evaluation And Modeling For Attributed Large Language Models Bernd Bohnet et al.
- St-moe: Designing Stable And Transferable Sparse Expert Models Barret Zoph et al.
- GODEL: Large-scale Pre-training For Goal-directed Dialog Baolin Peng et al.
- Large Language Models Are Better Reasoners With Self-verification Yixuan Weng et al.
- What Do They Capture? -- A Structural Analysis Of Pre-trained Language Models For Source Code Yao Wan et al.
- Retrieval Augmentation Of Large Language Models For Lay Language Generation Yue Guo, Wei Qiu, Gondy Leroy, Sheng Wang, Trevor Cohen
- Recurrent Memory Transformer Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev
- Automatic Chain Of Thought Prompting In Large Language Models Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola
- T-NER: An All-round Python Library For Transformer-based Named Entity Recognition Asahi Ushio, Jose Camacho-collados
- Reshaping Robot Trajectories Using Natural Language Commands: A Study Of Multi-modal Data Alignment Using Transformers Arthur Bucker et al.
- GLM-130B: An Open Bilingual Pre-trained Model Aohan Zeng et al.
- Grips: Gradient-free, Edit-based Instruction Search For Prompting Large Language Models Archiki Prasad, Peter Hase, Xiang Zhou, Mohit Bansal
- UL2: Unifying Language Learning Paradigms Yi Tay et al.
- Mslam: Massively Multilingual Joint Pre-training For Speech And Text Ankur Bapna et al.
- Internet-augmented Language Models Through Few-shot Prompting For Open-domain Question Answering Angeliki Lazaridou, Elena Gribovskaya, Wojciech Stokowiec, Nikolai Grigorev
- Making Large Language Models Better Reasoners With Step-aware Verifier Yifei Li et al.
- Clinical-longformer And Clinical-bigbird: Transformers For Long Clinical Sequences Yikuan Li, Ramsey M. Wehbe, Faraz S. Ahmad, Hanyin Wang, Yuan Luo
- The AI Teacher Test: Measuring The Pedagogical Ability Of Blender And GPT-3 In Educational Dialogues Anaïs Tack, Chris Piech
- Don't Generate, Discriminate: A Proposal For Grounding Language Models To Real-world Environments Yu Gu, Xiang Deng, Yu Su
- Generating Training Data With Language Models: Towards Zero-shot Language Understanding Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han
- Prompt-to-prompt Image Editing With Cross Attention Control Amir Hertz et al.
- Language Model Compression With Weighted Low-rank Factorization Yen-chang Hsu et al.
- Language Models Of Code Are Few-shot Commonsense Learners Aman Madaan, Shuyan Zhou, Uri Alon, Yiming Yang, Graham Neubig
- Contrastive Search Is What You Need For Neural Text Generation Yixuan Su, Nigel Collier
- Text And Patterns: For Effective Chain Of Thought, It Takes Two To Tango Aman Madaan, Amir Yazdanbakhsh
- Commonsenseqa 2.0: Exposing The Limits Of AI Through Gamification Alon Talmor et al.
- Memory-assisted Prompt Editing To Improve GPT-3 After Deployment Aman Madaan, Niket Tandon, Peter Clark, Yiming Yang
- Active Example Selection For In-context Learning Yiming Zhang, Shi Feng, Chenhao Tan
- A Model-agnostic Data Manipulation Method For Persona-based Dialogue Generation Yu Cao, Wei Bi, Meng Fang, Shuming Shi, Dacheng Tao
- Dual Modality Prompt Tuning For Vision-language Pre-trained Model Yinghui Xing et al.
- Should You Mask 15% In Masked Language Modeling? Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen
- WANLI: Worker And AI Collaboration For Natural Language Inference Dataset Creation Alisa Liu, Swabha Swayamdipta, Noah A. Smith, Yejin Choi
- Position-guided Text Prompt For Vision-language Pre-training Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan
- Empowering Language Models With Knowledge Graph Reasoning For Question Answering Ziniu Hu et al.
- ATTEMPT: Parameter-efficient Multi-task Tuning Via Attentional Mixtures Of Soft Prompts Akari Asai, Mohammadreza Salehi, Matthew E. Peters, Hannaneh Hajishirzi
- A Systematic Review And Replicability Study Of Bert4rec For Sequential Recommendation Aleksandr Petrov, Craig Macdonald
- Prompt Tuning For Discriminative Pre-trained Language Models Yuan Yao et al.
- Storydall-e: Adapting Pretrained Text-to-image Transformers For Story Continuation Adyasha Maharana, Darryl Hannan, Mohit Bansal
- Transformer Language Models Without Positional Encodings Still Learn Positional Information Adi Haviv, Ori Ram, Ofir Press, Peter Izsak, Omer Levy
- A New Path: Scaling Vision-and-language Navigation With Synthetic Instructions And Imitation Learning Aishwarya Kamath et al.
- Language Models Are Greedy Reasoners: A Systematic Formal Analysis Of Chain-of-thought Abulhair Saparov, He He
- Scaling Up Models And Data With t5x And seqio Adam Roberts et al.
- Beyond The Imitation Game: Quantifying And Extrapolating The Capabilities Of Language Models Aarohi Shammie Srivastava et al.
- TALM: Tool Augmented Language Models Aaron Parisi, Yao Zhao, Noah Fiedel
- Palm: Scaling Language Modeling With Pathways Aakanksha Chowdhery et al.
- Learn To Explain: Multimodal Reasoning Via Thought Chains For Science Question Answering Pan Lu et al.
- Dynamic Prompt Learning Via Policy Gradient For Semi-structured Mathematical Reasoning Pan Lu et al.
- Large Language Models And The Reverse Turing Test Terrence Sejnowski
- What Matters In Language Conditioned Robotic Imitation Learning Over Unstructured Data Oier Mees, Lukas Hermann, Wolfram Burgard
- Generative Spoken Dialogue Language Modeling Tu Anh Nguyen et al.
- Measuring And Narrowing The Compositionality Gap In Language Models Ofir Press et al.
- Emergent Analogical Reasoning In Large Language Models Taylor Webb, Keith J. Holyoak, Hongjing Lu
- Chatgpt: The End Of Online Exam Integrity? Teo Susnjak
- "This Is My Unicorn, Fluffy": Personalizing Frozen Vision-language Representations Niv Cohen, Rinon Gal, Eli A. Meirom, Gal Chechik, Yuval Atzmon
- Parallel Context Windows For Large Language Models Nir Ratner et al.
- Demonstrate-search-predict: Composing Retrieval And Language Models For Knowledge-intensive NLP Omar Khattab et al.
- LIFT: Language-interfaced Fine-tuning For Non-language Machine Learning Tasks Tuan Dinh et al.
- Learning To Compose Soft Prompts For Compositional Zero-shot Learning Nihal V. Nayak, Peilin Yu, Stephen H. Bach
- Unifiedskg: Unifying And Multi-tasking Structured Knowledge Grounding With Text-to-text Language Models Tianbao Xie et al.
- Thinking Fast And Slow In Large Language Models Thilo Hagendorff, Sarah Fabi, Michal Kosinski
- SGPT: GPT Sentence Embeddings For Semantic Search Niklas Muennighoff
- Large Language Models Are Reasoning Teachers Namgyu Ho, Laura Schmid, Se-young Yun
- Clinical Prompt Learning With Frozen Language Models Niall Taylor, Yi Zhang, Dan Joyce, Alejo Nevado-holgado, Andrey Kormilitzin
- Arabart: A Pretrained Arabic Sequence-to-sequence Model For Abstractive Summarization Moussa Kamal Eddine, Nadi Tomeh, Nizar Habash, Joseph Le Roux, Michalis Vazirgiannis
- Transformer Feed-forward Layers Build Predictions By Promoting Concepts In The Vocabulary Space Mor Geva, Avi Caciularu, Kevin Ro Wang, Yoav Goldberg
- Dylora: Parameter Efficient Tuning Of Pre-trained Models Using Dynamic Search-free Low-rank Adaptation Mojtaba Valipour, Mehdi Rezagholizadeh, Ivan Kobyzev, Ali Ghodsi
- Llm.int8(): 8-bit Matrix Multiplication For Transformers At Scale Tim Dettmers, Mike Lewis, Younes Belkada, Luke Zettlemoyer
- Rlprompt: Optimizing Discrete Text Prompts With Reinforcement Learning Mingkai Deng et al.
- Evaluating Human-language Model Interaction Mina Lee et al.
- An Empirical Study Of End-to-end Video-language Transformers With Masked Visual Modeling Tsu-jui Fu et al.
- Retrieval-augmented Multimodal Language Modeling Michihiro Yasunaga et al.
- Few-shot Training Llms For Project-specific Code-summarization Toufique Ahmed, Premkumar Devanbu
- CLIPPO: Image-and-language Understanding From Pixels Only Michael Tschannen, Basil Mustafa, Neil Houlsby
- Training And Evaluating A Jupyter Notebook Data Science Assistant Shubham Chandel, Colin B. Clement, Guillermo Serrato, Neel Sundaresan
- Coauthor: Designing A Human-ai Collaborative Writing Dataset For Exploring Language Model Capabilities Mina Lee, Percy Liang, Qian Yang
- GPT Takes The Bar Exam Michael Ii Bommarito, Daniel Martin Katz
- Re2g: Retrieve, Rerank, Generate Michael Glass et al.
- Transformer Quality In Linear Time Weizhe Hua, Zihang Dai, Hanxiao Liu, Quoc V. Le
- Black-box Tuning For Language-model-as-a-service Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu
- Confident Adaptive Language Modeling Tal Schuster et al.
- Make-a-video: Text-to-video Generation Without Text-video Data Uriel Singer et al.
- Large Language Models Are Zero-shot Reasoners Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa
- Decomposed Prompting: A Modular Approach For Solving Complex Tasks Tushar Khot et al.
- Meta Policy Learning For Cold-start Conversational Recommendation Zhendong Chu, Hongning Wang, Yun Xiao, Bo Long, Lingfei Wu
- 3DALL-E: Integrating Text-to-image AI In 3D Design Workflows Vivian Liu, Jo Vermeulen, George Fitzmaurice, Justin Matejka
- Help Me Write A Poem: Instruction Tuning As A Vehicle For Collaborative Poetry Writing Tuhin Chakrabarty, Vishakh Padmakumar, He He
- M6-rec: Generative Pretrained Language Models Are Open-ended Recommender Systems Zeyu Cui, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang
- OFA: Unifying Architectures, Tasks, And Modalities Through A Simple Sequence-to-sequence Learning Framework Peng Wang et al.
- Zero- And Few-shot Prompting With Llms: A Comparative Study With Fine-tuned Models For Bangla Sentiment Analysis Md. Arid Hasan et al.
- CTRAN: Cnn-transformer-based Network For Natural Language Understanding Mehrdad Rafiepour, Javad Salimi Sartakhti
- Can Llms Express Their Uncertainty? An Empirical Evaluation Of Confidence Elicitation In Llms Miao Xiong et al.
- Do Large Language Models Resemble Humans In Language Use? Zhenguang G. Cai, Xufeng Duan, David A. Haslett, Shuqi Wang, Martin J. Pickering
- From Image To Language: A Critical Analysis Of Visual Question Answering (VQA) Approaches, Challenges, And Opportunities Md Farhan Ishmam, Md Sakib Hossain Shovon, M. F. Mridha, Nilanjan Dey
- A Systematic Study And Comprehensive Evaluation Of Chatgpt On Benchmark Datasets Md Tahmid Rahman Laskar et al.
- Enhancing CLIP With GPT-4: Harnessing Visual Descriptions As Prompts Mayug Maniparambil et al.
- Gptaraeval: A Comprehensive Evaluation Of Chatgpt On Arabic NLP Md Tawkat Islam Khondaker, Abdul Waheed, El Moatez Billah Nagoudi, Muhammad Abdul-mageed
- Unleashing The Emergent Cognitive Synergy In Large Language Models: A Task-solving Agent Through Multi-persona Self-collaboration Zhenhailong Wang et al.
- Drivegpt4: Interpretable End-to-end Autonomous Driving Via Large Language Model Zhenhua Xu et al.
- An Empirical Evaluation Of Using Large Language Models For Automated Unit Test Generation Max Schäfer, Sarah Nadi, Aryaz Eghbali, Frank Tip
- Co-writing With Opinionated Language Models Affects Users' Views Maurice Jakesch, Advait Bhat, Daniel Buschek, Lior Zalmanson, Mor Naaman
- Voicebox: Text-guided Multilingual Universal Speech Generation At Scale Matthew Le et al.
- Distilling Large Language Models For Matching Patients To Clinical Trials Mauro Nievas, Aditya Basu, Yanshan Wang, Hrituraj Singh
- Large Language Models Effectively Leverage Document-level Context For Literary Translation, But Critical Errors Persist Marzena Karpinska, Mohit Iyyer
- LLM Self Defense: By Self Examination, Llms Know They Are Being Tricked Mansi Phute et al.
- Applenet: Visual Attention Parameterized Prompt Learning For Few-shot Remote Sensing Image Generalization Using CLIP Mainak Singha, Ankit Jha, Bhupendra Solanki, Shirsha Bose, Biplab Banerjee
- Natural Language Generation And Understanding Of Big Code For Ai-assisted Programming: A Review Man Fai Wong, Shangxin Guo, Ching Nam Hang, Siu Wai Ho, Chee Wei Tan
- The Reversal Curse: Llms Trained On "A Is B" Fail To Learn "B Is A" Lukas Berglund et al.
- Chatgpt For Vulnerability Detection, Classification, And Repair: How Far Are We? Michael Fu, Chakkrit Tantithamthavorn, Van Nguyen, Trung Le
- Document-level Machine Translation With Large Language Models Longyue Wang et al.
- Generative Artificial Intelligence In Learning Analytics: Contextualising Opportunities And Challenges Through The Learning Analytics Cycle Lixiang Yan, Roberto Martinez-maldonado, Dragan Gašević
- Practical And Ethical Challenges Of Large Language Models In Education: A Systematic Scoping Review Lixiang Yan et al.
- Comparing Sentence-level Suggestions To Message-level Suggestions In Ai-mediated Communication Liye Fu, Benjamin Newman, Maurice Jakesch, Sarah Kreps
- Give Us The Facts: Enhancing Large Language Models With Knowledge Graphs For Fact-aware Language Modeling Linyao Yang, Hongyang Chen, Zhao Li, Xiao Ding, Xindong Wu
- From Word Models To World Models: Translating From Natural Language To The Probabilistic Language Of Thought Lionel Wong et al.
- Human-ai Collaboration In Thematic Analysis Using Chatgpt: A User Study And Design Recommendations Lixiang Yan et al.
- Driving With Llms: Fusing Object-level Vector Modality For Explainable Autonomous Driving Long Chen et al.
- Parameter-efficient Fine-tuning Methods For Pretrained Language Models: A Critical Review And Assessment Lingling Xu, Haoran Xie, Si-zhao Joe Qin, Xiaohui Tao, Fu Lee Wang
- Leveraging Pre-trained Large Language Models To Construct And Utilize World Models For Model-based Task Planning Lin Guan, Karthik Valmeekam, Sarath Sreedharan, Subbarao Kambhampati
- Scaling Autoregressive Multi-modal Models: Pretraining And Instruction Tuning Lili Yu et al.
- Judging Llm-as-a-judge With Mt-bench And Chatbot Arena Lianmin Zheng et al.
- A Survey On Large Language Models For Recommendation Likang Wu et al.
- Can Chatgpt Replace Stackoverflow? A Study On Robustness And Reliability Of Large Language Model Code Generation Li Zhong, Zilong Wang
- Deep Learning Mental Health Dialogue System Lennart Brocki, George C. Dyer, Anna Gładka, Neo Christopher Chung
- Zero-shot Next-item Recommendation Using Large Pretrained Language Models Lei Wang, Ee-peng Lim
- Surgicalgpt: End-to-end Language-vision GPT For Visual Question Answering In Surgery Lalithkumar Seenivasan, Mobarakol Islam, Gokul Kannan, Hongliang Ren
- Superclue: A Comprehensive Chinese Large Language Model Benchmark Liang Xu et al.
- Sentimentgpt: Exploiting GPT For Advanced Sentiment Analysis And Its Departure From Current Machine Learning Kiana Kheiri, Hamid Karimi
- Just Tell Me: Prompt Engineering In Business Process Management Kiran Busch, Alexander Rochlitzer, Diana Sola, Henrik Leopold
- 14 Examples Of How Llms Can Transform Materials Science And Chemistry: A Reflection On A Large Language Model Hackathon Kevin Maik Jablonka et al.
- News Verifiers Showdown: A Comparative Performance Evaluation Of Chatgpt 3.5, Chatgpt 4.0, Bing AI, And Bard In News Fact-checking Kevin Matthe Caramancion
- Inference-time Intervention: Eliciting Truthful Answers From A Language Model Kenneth Li, Oam Patel, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg
- Automating Customer Service Using Langchain: Building Custom Open-source GPT Chatbot For Organizations Keivalya Pandya, Mehfuza Holia
- Speak, Memory: An Archaeology Of Books Known To Chatgpt/gpt-4 Kent K. Chang, Mackenzie Cramer, Sandeep Soni, David Bamman
- Just Ask For Calibration: Strategies For Eliciting Calibrated Confidence Scores From Language Models Fine-tuned With Human Feedback Katherine Tian et al.
- Evaluating Language Models For Mathematics Through Interactions Katherine M. Collins et al.
- A Survey Of GPT-3 Family Large Language Models Including Chatgpt And GPT-4 Katikapalli Subramanyam Kalyan
- Biomedical Knowledge Graph-optimized Prompt Generation For Large Language Models Karthik Soman et al.
- Waffling Around For Performance: Visual Classification With Random Words And Broad Concepts Karsten Roth et al.
- Chipgpt: How Far Are We From Natural Language Hardware Design Kaiyan Chang et al.
- The Imitation Game: Detecting Human And Ai-generated Texts In The Era Of Chatgpt And BARD Kadhim Hayawi, Sakib Shahriar, Sujith Samuel Mathew
- Not What You've Signed Up For: Compromising Real-world Llm-integrated Applications With Indirect Prompt Injection Kai Greshake et al.
- Evaluation And Analysis Of Hallucination In Large Vision-language Models Junyang Wang et al.
- Writer-defined AI Personas For On-demand Feedback Generation Karim Benharrak, Tim Zindulka, Florian Lehmann, Hendrik Heuer, Daniel Buschek
- BLIP-2: Bootstrapping Language-image Pre-training With Frozen Image Encoders And Large Language Models Junnan Li, Dongxu Li, Silvio Savarese, Steven Hoi
- A Comprehensive Capability Analysis Of GPT-3 And GPT-3.5 Series Models Junjie Ye et al.
- Recommendation As Instruction Following: A Large Language Model Empowered Recommendation Approach Junjie Zhang et al.
- Is Chatgpt A Good Recommender? A Preliminary Study Junling Liu et al.
- Evaluating GPT-4 And Chatgpt On Japanese Medical Licensing Examinations Jungo Kasai, Yuhei Kasai, Keisuke Sakaguchi, Yutaro Yamada, Dragomir Radev
- Chatcounselor: A Large Language Models For Mental Health Support June M. Liu et al.
- Transferable Decoding With Visual Entities For Zero-shot Image Captioning Junjie Fei et al.
- Spear Phishing With Large Language Models Julian Hazell
- Jatmo: Prompt Injection Defense By Task-specific Finetuning Julien Piet et al.
- GQA: Training Generalized Multi-query Transformer Models From Multi-head Checkpoints Joshua Ainslie et al.
- Minigpt-v2: Large Language Model As A Unified Interface For Vision-language Multi-task Learning Jun Chen et al.
- MEGA: Multilingual Evaluation Of Generative AI Kabir Ahuja et al.
- Phoenix: Democratizing Chatgpt Across Languages Zhihong Chen et al.
- Towards Llm-based Autograding For Short Textual Answers Johannes Schneider, Bernd Schenk, Christina Niklaus
- LEXTREME: A Multi-lingual And Multi-task Benchmark For The Legal Domain Joel Niklaus et al.
- Is Chatgpt Fair For Recommendation? Evaluating Fairness In Large Language Model Recommendation Jizhi Zhang et al.
- The Political Ideology Of Conversational AI: Converging Evidence On Chatgpt's Pro-environmental, Left-libertarian Orientation Jochen Hartmann, Jasper Schwenzow, Maximilian Witte
- Missrec: Pre-training And Transferring Multi-modal Interest-aware Sequence Representation For Recommendation Jinpeng Wang et al.
- Gptscore: Evaluate As You Desire Jinlan Fu, See-kiong Ng, Zhengbao Jiang, Pengfei Liu
- Bias And Fairness In Chatbots: An Overview Jintang Xue et al.
- The Potential And Pitfalls Of Using A Large Language Model Such As Chatgpt Or GPT-4 As A Clinical Assistant Jingqing Zhang et al.
- Structgpt: A General Framework For Large Language Model To Reason Over Structured Data Jinhao Jiang et al.
- Graphix-t5: Mixing Pre-trained Transformers With Graph-aware Layers For Text-to-sql Parsing Jinyang Li et al.
- On The Robustness Of Chatgpt: An Adversarial And Out-of-distribution Perspective Jindong Wang et al.
- Longnet: Scaling Transformers To 1,000,000,000 Tokens Jiayu Ding et al.
- Geotechnical Parrot Tales (GPT): Harnessing Large Language Models In Geotechnical Engineering Krishna Kumar
- Empowering Molecule Discovery For Molecule-caption Translation With Large Language Models: A Chatgpt Perspective Jiatong Li et al.
- Badgpt: Exploring Security Vulnerabilities Of Chatgpt Via Backdoor Attacks To Instructgpt Jiawen Shi, Yixin Liu, Pan Zhou, Lichao Sun
- Unified-io 2: Scaling Autoregressive Multimodal Models With Vision, Language, Audio, And Action Jiasen Lu et al.
- Set-of-mark Prompting Unleashes Extraordinary Visual Grounding In GPT-4V Jianwei Yang et al.
- Ethical Chatgpt: Concerns, Challenges, And Commandments Jianlong Zhou, Heimo Müller, Andreas Holzinger, Fang Chen
- Language Models Meet World Models: Embodied Experiences Enhance Language Models Jiannan Xiang et al.
- Think-on-graph: Deep And Responsible Reasoning Of Large Language Model On Knowledge Graph Jiashuo Sun et al.
- The Impact Of Chatgpt And Llms On Medical Imaging Stakeholders: Perspectives And Use Cases Jiancheng Yang, Hongwei Bran Li, Donglai Wei
- Rella: Retrieval-enhanced Large Language Models For Lifelong Sequential Behavior Comprehension In Recommendation Jianghao Lin et al.
- Imagebind-llm: Multi-modality Instruction Tuning Jiaming Han et al.
- On Decoder-only Architecture For Speech-to-text And Large Language Model Integration Jian Wu et al.
- Onellm: One Framework To Align All Modalities With Language Jiaming Han et al.
- ICL-D3IE: In-context Learning With Diverse Demonstrations Updating For Document Information Extraction Jiabang He et al.
- Graphgpt: Graph Instruction Tuning For Large Language Models Jiabin Tang et al.
- LLM Lies: Hallucinations Are Not Bugs, But Features As Adversarial Examples Jia-yu Yao et al.
- Unlearn What You Want To Forget: Efficient Unlearning For Llms Jiaao Chen, Diyi Yang
- Ureader: Universal Ocr-free Visually-situated Language Understanding With Multimodal Large Language Model Jiabo Ye et al.
- GPT-3.5, GPT-4, Or BARD? Evaluating Llms Reasoning Ability In Zero-shot Setting And Performance Boosting Through Prompts Jessica López Espejel, El Hassane Ettifouri, Mahaman Sanoussi Yahaya Alassan, El Mehdi Chouham, Walid Dahhane
- Learning To Compress Prompts With Gist Tokens Jesse Mu, Xiang Lisa Li, Noah Goodman
- Leveraging Large Language Models For Sequential Recommendation Jesse Harte et al.
- Larger Language Models Do In-context Learning Differently Jerry Wei et al.
- Artificial Muses: Generative Artificial Intelligence Chatbots Have Risen To Human-level Creativity Jennifer Haase, Paul H. P. Hanel
- Evaluating Large Language Models On A Highly-specialized Topic, Radiation Oncology Physics Jason Holmes et al.
- Chatgpt: Jack Of All Trades, Master Of None Jan Kocoń et al.
- Large Language Models (GPT) Struggle To Answer Multiple-choice Questions About Code Jaromir Savelka, Arav Agarwal, Christopher Bogart, Majd Sakr
- Chatgpt To Replace Crowdsourcing Of Paraphrases For Intent Classification: Higher Diversity And Comparable Model Robustness Jan Cegin, Jakub Simko, Peter Brusilovsky
- Thrilled By Your Progress! Large Language Models (GPT-4) No Longer Struggle To Pass Assessments In Higher Education Programming Courses Jaromir Savelka, Arav Agarwal, Marshall An, Chris Bogart, Majd Sakr
- A Comparative Study Of Ai-generated (GPT-4) And Human-crafted Mcqs In Programming Education Jacob Doughty et al.
- Simple And Controllable Music Generation Jade Copet et al.
- Chip-chat: Challenges And Opportunities In Conversational Hardware Design Jason Blocklove, Siddharth Garg, Ramesh Karri, Hammond Pearce
- Evaluation Of Chatgpt On Biomedical Tasks: A Zero-shot Comparison With Fine-tuned Generative Transformers Israt Jahan, Md Tahmid Rahman Laskar, Chun Peng, Jimmy Huang
- Chatgpt In The Classroom: An Analysis Of Its Strengths And Weaknesses For Solving Undergraduate Computer Science Questions Ishika Joshi et al.
- More Robots Are Coming: Large Multimodal Models (chatgpt) Can Solve Visually Diverse Images Of Parsons Problems Irene Hou et al.
- Factuality Challenges In The Era Of Large Language Models Isabelle Augenstein et al.
- "it's Not Like Jarvis, But It's Pretty Close!" -- Examining Chatgpt's Usage Among Undergraduate Students In Computer Science Ishika Joshi, Ritvik Budhiraja, Harshal D Akolekar, Jagat Sesh Challa, Dhruv Kumar
- The Curse Of Recursion: Training On Generated Data Makes Models Forget Ilia Shumailov et al.
- Principle-driven Self-alignment Of Language Models From Scratch With Minimal Human Supervision Zhiqing Sun et al.
- Muse: Text-to-image Generation Via Masked Generative Transformers Huiwen Chang et al.
- Llama: Open And Efficient Foundation Language Models Hugo Touvron et al.
- Ip-adapter: Text Compatible Image Prompt Adapter For Text-to-image Diffusion Models Hu Ye, Jun Zhang, Sibo Liu, Xiao Han, Wei Yang
- Llmlingua: Compressing Prompts For Accelerated Inference Of Large Language Models Huiqiang Jiang, Qianhui Wu, Chin-yew Lin, Yuqing Yang, Lili Qiu
- Chatgpt Chemistry Assistant For Text Mining And Prediction Of MOF Synthesis Zhiling Zheng, Oufan Zhang, Christian Borgs, Jennifer T. Chayes, Omar M. Yaghi
- "it's A Fair Game", Or Is It? Examining How Users Navigate Disclosure Risks And Benefits When Using Llm-based Conversational Agents Zhiping Zhang et al.
- Building Cooperative Embodied Agents Modularly With Large Language Models Hongxin Zhang et al.
- Fingpt: Open-source Financial Large Language Models Hongyang Yang, Xiao-yang Liu, Christina Dan Wang
- Doctorglm: Fine-tuning Your Chinese Doctor Is Not A Herculean Task Honglin Xiong et al.
- Cognitive Mirage: A Review Of Hallucinations In Large Language Models Hongbin Ye, Tong Liu, Aijia Zhang, Wei Hua, Weiqiang Jia
- Semantic Compression With Large Language Models Henry Gilbert, Michael Sandborn, Douglas C. Schmidt, Jesse Spencer-smith, Jules White
- Bioinstruct: Instruction Tuning Of Large Language Models For Biomedical Natural Language Processing Hieu Tran, Zhichao Yang, Zonghai Yao, Hong Yu
- Large Language Models Can Infer Psychological Dispositions Of Social Media Users Heinrich Peters, Sandra Matz
- Extending Context Window Of Large Language Models Via Positional Interpolation Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian
- Boosting Theory-of-mind Performance In Large Language Models Via Prompting Shima Rahimi Moghaddam, Christopher J. Honey
- Mathprompter: Mathematical Reasoning Using Large Language Models Shima Imani, Liang Du, Harsh Shrivastava
- Unifying Large Language Models And Knowledge Graphs: A Roadmap Shirui Pan et al.
- VIP5: Towards Multimodal Foundation Models For Recommendation Shijie Geng, Juntao Tan, Shuchang Liu, Zuohui Fu, Yongfeng Zhang
- Llm-in-the-loop: Leveraging Large Language Model For Thematic Analysis Shih-chieh Dai, Aiping Xiong, Lun-wei Ku
- Reasoning With Language Model Is Planning With World Model Shibo Hao et al.
- Toolkengpt: Augmenting Frozen Language Models With Massive Tools Via Tool Embeddings Shibo Hao, Tianyang Liu, Zhen Wang, Zhiting Hu
- Gorilla: Large Language Model Connected With Massive Apis Shishir G. Patil, Tianjun Zhang, Xin Wang, Joseph E. Gonzalez
- Mixture-of-experts Meets Instruction Tuning: A Winning Combination For Large Language Models Sheng Shen et al.
- Why Does Chatgpt Fall Short In Providing Truthful Answers? Shen Zheng, Jie Huang, Kevin Chen-chuan Chang
- Recommender Systems With Generative Retrieval Shashank Rajput et al.
- Evaluation Of Chatgpt Family Of Models For Biomedical Reasoning And Classification Shan Chen et al.
- Verigen: A Large Language Model For Verilog Code Generation Shailja Thakur et al.
- Decoding Chatgpt: A Taxonomy Of Existing Research, Current Challenges, And Possible Future Directions Shahab Saquib Sohail et al.
- The Cot Collection: Improving Zero-shot And Few-shot Learning Of Language Models Via Chain-of-thought Fine-tuning Seungone Kim et al.
- Factscore: Fine-grained Atomic Evaluation Of Factual Precision In Long Form Text Generation Sewon Min et al.
- On Codex Prompt Engineering For OCL Generation: An Empirical Study Seif Abukhalaf, Mohammad Hamdaqa, Foutse Khomh
- The Moral Authority Of Chatgpt Sebastian Krügel, Andreas Ostermaier, Matthias Uhl
- Next-gpt: Any-to-any Multimodal LLM Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, Tat-seng Chua
- H2O: Heavy-hitter Oracle For Efficient Generative Inference Of Large Language Models Zhenyu Zhang et al.
- GPT-RE: In-context Learning For Relation Extraction Using Large Language Models Zhen Wan et al.
- A Comparative Study Of Open-source Large Language Models, GPT-4 And Claude 2: Multiple-choice Test Taking In Nephrology Sean Wu et al.
- Generating Phishing Attacks Using Chatgpt Sayak Saha Roy, Krishna Vamsi Naragam, Shirin Nilizadeh
- Medalign: A Clinician-generated Dataset For Instruction Following With Electronic Medical Records Scott L. Fleming et al.
- Chatgpt Or Human? Detect And Explain. Explaining Decisions Of Machine Learning Model For Detecting Short Chatgpt-generated Text Sandra Mitrović, Davide Andreoletti, Omran Ayoub
- Let's Have A Chat! A Conversation With Chatgpt: Technology, Applications, And Limitations Sakib Shahriar, Kadhim Hayawi
- Ai-assisted Coding: Experiments With GPT-4 Russell A Poldrack, Thomas Lu, Gašper Beguš
- Verify-and-edit: A Knowledge-enhanced Chain-of-thought Framework Ruochen Zhao, Xingxuan Li, Shafiq Joty, Chengwei Qin, Lidong Bing
- Are Emergent Abilities Of Large Language Models A Mirage? Rylan Schaeffer, Brando Miranda, Sanmi Koyejo
- Does Synthetic Data Generation Of Llms Help Clinical Text Mining? Ruixiang Tang, Xiaotian Han, Xiaoqian Jiang, Xia Hu
- Chatgpt Vs. Google: A Comparative Study Of Search Performance And User Experience Ruiyun Rayna Xu, Yue Katherine Feng, Hailiang Chen
- Gpt4tools: Teaching Large Language Model To Use Tools Via Self-instruction Rui Yang et al.
- Prompting For Multimodal Hateful Meme Classification Rui Cao, Roy Ka-wei Lee, Wen-haw Chong, Jing Jiang
- Gpteval: A Survey On Assessments Of Chatgpt And GPT-4 Rui Mao, Guanyi Chen, Xulang Zhang, Frank Guerin, Erik Cambria
- Secrets Of RLHF In Large Language Models Part I: PPO Rui Zheng et al.
- Audiogpt: Understanding And Generating Speech, Music, Sound, And Talking Head Rongjie Huang et al.
- Tinystories: How Small Can Language Models Be And Still Speak Coherent English? Ronen Eldan, Yuanzhi Li
- Palm 2 Technical Report Rohan Anil et al.
- In-context Learning Creates Task Vectors Roee Hendel, Mor Geva, Amir Globerson
- Chatgpt Is Not All You Need. A State Of The Art Review Of Large Generative AI Models Roberto Gozalo-brizuela, Eduardo C. Garrido-merchan
- Llm-assisted Content Analysis: Using Large Language Models To Support Deductive Coding Robert Chew, John Bollenbacher, Michael Wenger, Jessica Speer, Annice Kim
- Retrieval-augmented Image Captioning Rita Ramos, Desmond Elliott, Bruno Martins
- Prompt, Generate, Then Cache: Cascade Of Foundation Models Makes Strong Few-shot Learners Renrui Zhang et al.
- Scaling Laws For Language Encoding Models In Fmri Richard Antonello, Aditya Vaidya, Alexander G. Huth
- Chatgpt Versus Traditional Question Answering For Knowledge Graphs: Current Status And Future Directions Towards Knowledge Graph Chatbots Reham Omar, Omij Mangukiya, Panos Kalnis, Essam Mansour
- Llama-adapter: Efficient Fine-tuning Of Language Models With Zero-init Attention Renrui Zhang et al.
- Chatgpt As A Factual Inconsistency Evaluator For Text Summarization Zheheng Luo, Qianqian Xie, Sophia Ananiadou
- How Secure Is Code Generated By Chatgpt? Raphaël Khoury, Anderson R. Avila, Jacob Brunelle, Baba Mamadou Camara
- Sabiá: Portuguese Large Language Models Ramon Pires, Hugo Abonizio, Thales Sales Almeida, Rodrigo Nogueira
- Large Language Models Predict Human Sensory Judgments Across Six Modalities Raja Marjieh, Ilia Sucholutsky, Pol Van Rijn, Nori Jacoby, Thomas L. Griffiths
- Can We Trust The Evaluation On Chatgpt? Rachith Aiyappa, Jisun An, Haewoon Kwak, Yong-yeol Ahn
- Embers Of Autoregression: Understanding Large Language Models Through The Problem They Are Trained To Solve R. Thomas Mccoy, Shunyu Yao, Dan Friedman, Matthew Hardy, Thomas L. Griffiths
- Lawyer Llama Technical Report Quzhe Huang et al.
- Evaluation Of Chatgpt-generated Medical Responses: A Systematic Review And Meta-analysis Qiuhong Wei et al.
- Can Large Language Models Replace Humans In The Systematic Review Process? Evaluating Gpt-4's Efficacy In Screening And Extracting Data From Peer-reviewed And Grey Literature In Multiple Languages Qusai Khraisha, Sophie Put, Johanna Kappenberg, Azza Warraitch, Kristin Hadfield
- Translating Radiology Reports Into Plain Language Using Chatgpt And GPT-4 With Prompt Learning: Promising Results, Limitations, And Potential Qing Lyu et al.
- Faithful Chain-of-thought Reasoning Qing Lyu et al.
- Grounded Text-to-image Synthesis With Attention Refocusing Quynh Phung, Songwei Ge, Jia-bin Huang
- Pushing Large Language Models To The 6G Edge: Vision, Challenges, And Opportunities Zheng Lin et al.
- Genegpt: Augmenting Large Language Models With Domain Tools For Improved Access To Biomedical Information Qiao Jin, Yifan Yang, Qingyu Chen, Zhiyong Lu
- Medcpt: Contrastive Pre-trained Transformers With Large-scale Pubmed Search Logs For Zero-shot Biomedical Information Retrieval Qiao Jin et al.
- Harnessing Llms In Curricular Design: Using GPT-4 To Support Authoring Of Learning Objectives Pragnya Sridhar et al.
- Selfcheckgpt: Zero-resource Black-box Hallucination Detection For Generative Large Language Models Potsawee Manakul, Adian Liusie, Mark J. F. Gales
- Students' Perceptions And Preferences Of Generative Artificial Intelligence Feedback For Programming Zhengdong Zhang et al.
- Regulating Chatgpt And Other Large Generative AI Models Philipp Hacker, Andreas Engel, Marco Mauer
- Git-mol: A Multi-modal Large Language Model For Molecular Science With Graph, Image, And Text Pengfei Liu, Yiming Ren, Jun Tao, Zhixiang Ren
- Llama-adapter V2: Parameter-efficient Visual Instruction Model Peng Gao et al.
- GPT Has Become Financially Literate: Insights From Financial Literacy Tests Of GPT And A Preliminary Test Of How People Use It As A Source Of Advice Paweł Niszczota, Sami Abbas
- Audiopalm: A Large Language Model That Can Speak And Listen Paul K. Rubenstein et al.
- Graphologue: Exploring Large Language Model Responses With Interactive Diagrams Peiling Jiang, Jude Rayan, Steven P. Dow, Haijun Xia
- Starcoder: May The Source Be With You! Raymond Li et al.
- VISAR: A Human-ai Argumentative Writing Assistant With Visual Programming And Rapid Draft Prototyping Zheng Zhang, Jie Gao, Ranjodh Singh Dhaliwal, Toby Jia-jun Li
- Internlm-xcomposer: A Vision-language Large Model For Advanced Text-image Comprehension And Composition Pan Zhang et al.
- In-context Retrieval-augmented Language Models Ori Ram et al.
- Dspy: Compiling Declarative Language Model Calls Into Self-improving Pipelines Omar Khattab et al.
- GPT-4 Technical Report OpenAI et al.
- Chameleon: Plug-and-play Compositional Reasoning With Large Language Models Pan Lu et al.
- Ontochatgpt Information System: Ontology-driven Structured Prompts For Chatgpt Meta-learning Oleksandr Palagin, Vladislav Kaverinskiy, Anna Litvin, Kyrylo Malakhov
- Hallucinations In Large Multilingual Translation Models Nuno M. Guerreiro et al.
- Large Language Models Are Built-in Autoregressive Search Engines Noah Ziems, Wenhao Yu, Zhihan Zhang, Meng Jiang
- Faith And Fate: Limits Of Transformers On Compositionality Nouha Dziri et al.
- Reflexion: Language Agents With Verbal Reinforcement Learning Noah Shinn et al.
- Enhancing Chat Language Models By Scaling High-quality Instructional Conversations Ning Ding et al.
- Chatgpt Is A Knowledgeable But Inexperienced Solver: An Investigation Of Commonsense Problem In Large Language Models Ning Bian et al.
- CAT-LM: Training Language Models On Aligned Code And Tests Nikitha Rao, Kush Jain, Uri Alon, Claire Le Goues, Vincent J. Hellendoorn
- Automated Annotation With Generative AI Requires Validation Nicholas Pangakis, Samuel Wolken, Neil Fasching
- Sources Of Hallucination By Large Language Models On Inference Tasks Nick Mckenna et al.
- Self-contradictory Hallucinations Of Large Language Models: Evaluation, Detection And Mitigation Niels Mündler, Jingxuan He, Slobodan Jenko, Martin Vechev
- Jais And Jais-chat: Arabic-centric Foundation And Instruction-tuned Open Generative Large Language Models Neha Sengupta et al.
- A Stitch In Time Saves Nine: Detecting And Mitigating Hallucinations Of Llms By Validating Low-confidence Generation Neeraj Varshney, Wenlin Yao, Hongming Zhang, Jianshu Chen, Dong Yu
- Large Language Models Are Zero-shot Time Series Forecasters Nate Gruver, Marc Finzi, Shikai Qiu, Andrew Gordon Wilson
- Exploring The Potential Of Large Language Models To Generate Formative Programming Feedback Natalie Kiesler, Dominic Lohr, Hieke Keuning
- Chatgpt MT: Competitive For High- (but Not Low-) Resource Languages Nathaniel R. Robinson, Perez Ogayo, David R. Mortensen, Graham Neubig
- Consistency Analysis Of Chatgpt Myeongjun Erik Jang, Thomas Lukasiewicz
- Label Supervised Llama Finetuning Zongxi Li et al.
- Scaling Vision Transformers To 22 Billion Parameters Mostafa Dehghani et al.
- Using Large Language Models To Generate Junit Tests: An Empirical Study Mohammed Latif Siddiq et al.
- A Review Of Chatgpt Applications In Education, Marketing, Software Engineering, And Healthcare: Benefits, Drawbacks, And Research Directions Mohammad Fraiwan, Natheer Khasawneh
- Open Sesame! Universal Black Box Jailbreaking Of Large Language Models Raz Lapid, Ron Langberg, Moshe Sipper
- Api-bank: A Comprehensive Benchmark For Tool-augmented Llms Minghao Li et al.
- Reward Design With Language Models Minae Kwon, Sang Michael Xie, Kalesha Bullard, Dorsa Sadigh
- Scalable Extraction Of Training Data From (production) Language Models Milad Nasr et al.
- Evaluating Large Language Models In Theory Of Mind Tasks Michal Kosinski
- Detecting Llm-generated Text In Computing Education: A Comparative Study For Chatgpt Cases Michael Sheinman Orenstrakh, Oscar Karnalim, Carlos Anibal Suarez, Michael Liut
- A Large Language Model Approach To Educational Survey Feedback Analysis Michael J. Parker, Caitlin Anderson, Claire Stone, Yearim Oh
- Hyena Hierarchy: Towards Larger Convolutional Language Models Michael Poli et al.
- Video-chatgpt: Towards Detailed Video Understanding Via Large Vision And Language Models Muhammad Maaz, Hanoona Rasheed, Salman Khan, Fahad Shahbaz Khan
- LAMM: Language-assisted Multi-modal Instruction-tuning Dataset, Framework, And Benchmark Zhenfei Yin et al.
- Large Language Models Are Effective Text Rankers With Pairwise Ranking Prompting Zhen Qin et al.
- Empirical Study Of Zero-shot NER With Chatgpt Tingyu Xie et al.
- Large Language Models Are State-of-the-art Evaluators Of Translation Quality Tom Kocmi, Christian Federmann
- Qlora: Efficient Finetuning Of Quantized Llms Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, Luke Zettlemoyer
- Medalpaca -- An Open-source Collection Of Medical Conversational AI Models And Training Data Tianyu Han et al.
- RLHF-V: Towards Trustworthy Mllms Via Behavior Alignment From Fine-grained Correctional Human Feedback Tianyu Yu et al.
- Generalized Planning In PDDL Domains With Pretrained Large Language Models Tom Silver et al.
- Encouraging Divergent Thinking In Large Language Models Through Multi-agent Debate Tian Liang et al.
- Large Language Model Alignment: A Survey Tianhao Shen et al.
- Diagnostic Reasoning Prompts Reveal The Potential For Large Language Model Interpretability In Medicine Thomas Savage, Ashwin Nayak, Robert Gallo, Ekanath Rangan, Jonathan H Chen
- Grounding Large Language Models In Interactive Environments With Online Reinforcement Learning Thomas Carta et al.
- Cognitive Architectures For Language Agents Theodore R. Sumers, Shunyu Yao, Karthik Narasimhan, Thomas L. Griffiths
- Red Teaming Chatgpt Via Jailbreaking: Bias, Robustness, Reliability And Toxicity Terry Yue Zhuo, Yujin Huang, Chunyang Chen, Zhenchang Xing
- Deception Abilities Emerged In Large Language Models Thilo Hagendorff
- Hallusionbench: An Advanced Diagnostic Suite For Entangled Language Hallucination And Visual Illusion In Large Vision-language Models Tianrui Guan et al.
- Multimodal-gpt: A Vision And Language Model For Dialogue With Humans Tao Gong et al.
- Is Chatgpt A Highly Fluent Grammatical Error Correction System? A Comprehensive Evaluation Tao Fang et al.
- Caption Anything: Interactive Image Description With Diverse Multimodal Controls Teng Wang et al.
- What Can Large Language Models Do In Chemistry? A Comprehensive Benchmark On Eight Tasks Taicheng Guo et al.
- Chatgpt: Beginning Of An End Of Manual Linguistic Data Annotation? Use Case Of Automatic Genre Identification Taja Kuzman, Igor Mozetič, Nikola Ljubešić
- Sparks Of Artificial General Intelligence: Early Experiments With GPT-4 Sébastien Bubeck et al.
- Observations On Llms For Telecom Domain: Capabilities And Limitations Sumit Soman, Ranjani H G
- Uncovering Chatgpt's Capabilities In Recommender Systems Sunhao Dai et al.
- Textbooks Are All You Need Suriya Gunasekar et al.
- Orca: Progressive Learning From Complex Explanation Traces Of GPT-4 Subhabrata Mukherjee et al.
- Transformative Effects Of Chatgpt On Modern Education: Emerging Era Of AI Chatbots Sukhpal Singh Gill et al.
- Analyzing The Performance Of GPT-3.5 And GPT-4 In Grammatical Error Correction Steven Coyne, Keisuke Sakaguchi, Diana Galvan-sosa, Michael Zock, Kentaro Inui
- AI, Write An Essay For Me: A Large-scale Comparison Of Human-written Versus Chatgpt-generated Essays Steffen Herbold, Annette Hautli-janisz, Ute Heuer, Zlata Kikteva, Alexander Trautsch
- Chatgpt Perpetuates Gender Bias In Machine Translation And Ignores Non-gendered Pronouns: Findings Across Bengali And Five Other Low-resource Languages Sourojit Ghosh, Aylin Caliskan
- Chatgpt Is Fun, But It Is Not Funny! Humor Is Still Challenging Large Language Models Sophie Jentzsch, Kristian Kersting
- Expressive Text-to-image Generation With Rich Text Songwei Ge, Taesung Park, Jun-yan Zhu, Jia-bin Huang
- Zhongjing: Enhancing The Chinese Medical Capabilities Of Large Language Model Through Expert Feedback And Real-world Multi-turn Dialogue Songhua Yang et al.
- Revisiting Relation Extraction In The Era Of Large Language Models Somin Wadhwa, Silvio Amir, Byron C. Wallace
- Principled Instructions Are All You Need For Questioning Llama-1/2, GPT-3.5/4 Sondos Mahmoud Bsharat, Aidar Myrzakhan, Zhiqiang Shen
- Llm-empowered Chatbots For Psychiatrist And Patient Simulation: Application And Evaluation Siyuan Chen et al.
- Metagpt: Meta Programming For A Multi-agent Collaborative Framework Sirui Hong et al.
- On The Possibilities Of Ai-generated Text Detection Souradip Chakraborty et al.
- Thoughtsource: A Central Hub For Large Language Model Reasoning Data Simon Ott et al.
- Wikichat: Stopping The Hallucination Of Large Language Model Chatbots By Few-shot Grounding On Wikipedia Sina J. Semnani, Violet Z. Yao, Heidi C. Zhang, Monica S. Lam
- Mind Meets Machine: Unravelling Gpt-4's Cognitive Psychology Sifatkaur Dhingra, Manmeet Singh, Vaisakh Sb, Neetiraj Malviya, Sukhpal Singh Gill
- Can A Student Large Language Model Perform As Well As Its Teacher? Sia Gholami, Marwan Omar
- Mariogpt: Open-ended Text2level Generation Through Large Language Models Shyam Sudhakaran et al.
- Tree Of Thoughts: Deliberate Problem Solving With Large Language Models Shunyu Yao et al.
- A Survey On Multimodal Large Language Models Shukang Yin et al.
- Opportunities And Challenges For Chatgpt And Large Language Models In Biomedicine And Health Shubo Tian et al.
- Automl-gpt: Automatic Machine Learning With GPT Shujian Zhang, Chengyue Gong, Lemeng Wu, Xingchao Liu, Mingyuan Zhou
- From Words To Watts: Benchmarking The Energy Costs Of Large Language Model Inference Siddharth Samsi et al.
- Unlocking The Potential Of Chatgpt: A Comprehensive Exploration Of Its Applications, Advantages, Limitations, And Future Directions In Natural Language Processing Walid Hariri
- Memorybank: Enhancing Large Language Models With Long-term Memory Wanjun Zhong, Lianghong Guo, Qiqi Gao, He Ye, Yanlin Wang
- Inpars-v2: Large Language Models As Efficient Dataset Generators For Information Retrieval Vitor Jeronymo et al.
- Chatgpt Beyond English: Towards A Comprehensive Evaluation Of Large Language Models In Multilingual Learning Viet Dac Lai et al.
- Is GPT-4 A Reliable Rater? Evaluating Consistency In GPT-4 Text Ratings Veronika Hackl, Alexandra Elena Müller, Michael Granitzer, Maximilian Sailer
- LIDA: A Tool For Automatic Generation Of Grammar-agnostic Visualizations And Infographics Using Large Language Models Victor Dibia
- Automated Reading Passage Generation With Openai's Large Language Model Ummugul Bezirhan, Matthias Von Davier
- Language Model Behavior: A Comprehensive Survey Tyler A. Chang, Benjamin K. Bergen
- Automating Human Tutor-style Programming Feedback: Leveraging GPT-4 Tutor Model For Hint Generation And GPT-3.5 Student Model For Hint Validation Tung Phung et al.
- Generative AI For Programming Education: Benchmarking Chatgpt, GPT-4, And Human Tutors Tung Phung et al.
- Flashattention-2: Faster Attention With Better Parallelism And Work Partitioning Tri Dao
- Automatic Semantic Augmentation Of Language Model Prompts (for Code Summarization) Toufique Ahmed, Kunal Suresh Pai, Premkumar Devanbu, Earl T. Barr
- Chinese Intermediate English Learners Outdid Chatgpt In Deep Cohesion: Evidence From English Narrative Writing Tongquan Zhou, Siyi Cao, Siruo Zhou, Yao Zhang, Aijing He
- Trusting Your Evidence: Hallucinate Less With Context-aware Decoding Weijia Shi et al.
- REPLUG: Retrieval-augmented Black-box Language Models Weijia Shi et al.
- A Preliminary Evaluation Of Chatgpt For Zero-shot Dialogue Understanding Wenbo Pan, Qiguang Chen, Xiao Xu, Wanxiang Che, Libo Qin
- A Survey Of Large Language Models Wayne Xin Zhao et al.
- EVA-02: A Visual Representation For Neon Genesis Yuxin Fang et al.
- Can Large Language Models Provide Useful Feedback On Research Papers? A Large-scale Empirical Analysis Weixin Liang et al.
- Is Chatgpt Equipped With Emotional Dialogue Capabilities? Weixiang Zhao et al.
- GPT Detectors Are Biased Against Non-native English Writers Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, James Zou
- Is Chatgpt Good At Search? Investigating Large Language Models As Re-ranking Agents Weiwei Sun et al.
- Layoutgpt: Compositional Visual Planning And Generation With Large Language Models Weixi Feng et al.
- Chatgraph: Interpretable Text Classification By Converting Chatgpt Knowledge To Graphs Yucheng Shi et al.
- R2gengpt: Radiology Report Generation With Frozen Llms Zhanyu Wang, Lingqiao Liu, Lei Wang, Luping Zhou
- Chatgpt For PLC/DCS Control Logic Generation Heiko Koziolek, Sten Gruener, Virendra Ashiwal
- Capabilities Of GPT-4 On Medical Challenge Problems Harsha Nori, Nicholas King, Scott Mayer Mckinney, Dean Carignan, Eric Horvitz
- Can Generalist Foundation Models Outcompete Special-purpose Tuning? Case Study In Medicine Harsha Nori et al.
- Is Chatgpt The Ultimate Programming Assistant -- How Far Is It? Haoye Tian et al.
- Extractive Summarization Via Chatgpt For Faithful Summary Generation Haopeng Zhang, Xiao Liu, Jiawei Zhang
- Chatgpt Or Grammarly? Evaluating Chatgpt On Grammatical Error Correction Benchmark Haoran Wu, Wenxuan Wang, Yuxuan Wan, Wenxiang Jiao, Michael Lyu
- Can Chatgpt Assess Human Personalities? A General Evaluation Framework Haocong Rao, Cyril Leung, Chunyan Miao
- Q-instruct: Improving Low-level Visual Abilities For Multi-modality Foundation Models Haoning Wu et al.
- Autodroid: Llm-powered Task Automation In Android Hao Wen et al.
- Safety Assessment Of Chinese Large Language Models Hao Sun, Zhexin Zhang, Jiawen Deng, Jiale Cheng, Minlie Huang
- Reasoning Implicit Sentiment With Chain-of-thought Prompting Hao Fei et al.
- Visual Instruction Tuning Haotian Liu, Chunyuan Li, Qingyang Wu, Yong Jae Lee
- Personalisation Within Bounds: A Risk Taxonomy And Policy Framework For The Alignment Of Large Language Models With Personalised Feedback Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale
- Personallm: Investigating The Ability Of Large Language Models To Express Personality Traits Hang Jiang et al.
- Choice Over Control: How Users Write With Large Language Models Using Diegetic And Non-diegetic Prompting Hai Dang, Sven Goller, Florian Lehmann, Daniel Buschek
- GPT-4 Enhanced Multimodal Grounding For Autonomous Driving: Leveraging Cross-modal Attention With Large Language Models Haicheng Liao et al.
- Wizardmath: Empowering Mathematical Reasoning For Large Language Models Via Reinforced Evol-instruct Haipeng Luo et al.
- Applying Large Language Models And Chain-of-thought For Automatic Scoring Gyeong-geon Lee, Ehsan Latif, Xuansheng Wu, Ninghao Liu, Xiaoming Zhai
- Revisiting Large Language Models As Zero-shot Relation Extractors Guozheng Li, Peng Wang, Wenjun Ke
- Auggpt: Leveraging Chatgpt For Text Data Augmentation Haixing Dai et al.
- Exploring The Psychology Of Llms' Moral And Legal Reasoning Guilherme F. C. F. Almeida, José Luiz Nunes, Neele Engelmann, Alex Wiegmann, Marcelo De Araújo
- Chatgpt Hallucinates When Attributing Answers Guido Zuccon, Bevan Koopman, Razia Shaik
- Perspectives On Large Language Models For Relevance Judgment Guglielmo Faggioli et al.
- Voyager: An Open-ended Embodied Agent With Large Language Models Guanzhi Wang et al.
- Dr Chatgpt, Tell Me What I Want To Hear: How Prompt Knowledge Impacts Health Answer Correctness Guido Zuccon, Bevan Koopman
- Chatgpt For Shaping The Future Of Dentistry: The Potential Of Multi-modal Large Language Model Hanyao Huang et al.
- Efficient Streaming Language Models With Attention Sinks Guangxuan Xiao, Yuandong Tian, Beidi Chen, Song Han, Mike Lewis
- Language Models Can Solve Computer Tasks Geunwoo Kim, Pierre Baldi, Stephen Mcaleer
- Performance Of The Pre-trained Large Language Model GPT-4 On Automated Short Answer Grading Gerd Kortemeyer
- Explainability For Large Language Models: A Survey Haiyan Zhao et al.
- Lawbench: Benchmarking Legal Knowledge Of Large Language Models Zhiwei Fei et al.
- Navgpt: Explicit Reasoning In Vision-and-language Navigation With Large Language Models Gengze Zhou, Yicong Hong, Qi Wu
- Text Matching Improves Sequential Recommendation By Reducing Popularity Biases Zhenghao Liu et al.
- Do Large Language Models Show Decision Heuristics Similar To Humans? A Case Study Using GPT-3.5 Gaurav Suri, Lily R. Slater, Ali Ziaee, Morgan Nguyen
- Batch Prompting: Efficient Inference With Large Language Model Apis Zhoujun Cheng, Jungo Kasai, Tao Yu
- Multimodal Chatgpt For Medical Applications: An Experimental Study Of GPT-4V Zhiling Yan et al.
- Lost In Translation: Large Language Models In Non-english Content Analysis Gabriel Nicholas, Aliya Bhatia
- LLMR: Real-time Prompting Of Interactive Worlds Using Large Language Models Fernanda De La Torre et al.
- Exploring Human-like Translation Strategy With Large Language Models Zhiwei He et al.
- Preference Ranking Optimization For Human Alignment Feifan Song et al.
- Empower Large Language Model To Perform Better On Industrial Domain-specific Question Answering Fangkai Yang et al.
- Is Chatgpt Better Than Human Annotators? Potential And Limitations Of Chatgpt In Explaining Implicit Hate Speech Fan Huang, Haewoon Kwak, Jisun An
- Chatgpt Outperforms Crowd-workers For Text-annotation Tasks Fabrizio Gilardi, Meysam Alizadeh, Maël Kubli
- Learning To Reason Over Scene Graphs: A Case Study Of Finetuning GPT-2 Into A Robot Language Model For Grounded Task Planning Georgia Chalvatzaki et al.
- Codegen2: Lessons For Training Llms On Programming And Natural Languages Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou
- Learning To Prompt In The Classroom To Understand AI Limits: A Pilot Study Emily Theophilou et al.
- Towards Efficient Fine-tuning Of Pre-trained Code Models: An Experimental Study And Beyond Ensheng Shi et al.
- Moviechat: From Dense Token To Sparse Memory For Long Video Understanding Enxin Song et al.
- Should Chatgpt Be Biased? Challenges And Risks Of Bias In Large Language Models Emilio Ferrara
- Sparsegpt: Massive Language Models Can Be Accurately Pruned In One-shot Elias Frantar, Dan Alistarh
- Survey Of Vulnerabilities In Large Language Models Revealed By Adversarial Attacks Erfan Shayegani et al.
- Aligning Large Multimodal Models With Factually Augmented RLHF Zhiqing Sun et al.
- Llm-adapters: An Adapter Family For Parameter-efficient Fine-tuning Of Large Language Models Zhiqiang Hu et al.
- Simulating H.P. Lovecraft Horror Literature With The Chatgpt Large Language Model Eduardo C. Garrido-merchán, José Luis Arroyo-barrigüete, Roberto Gozalo-brizuela
- Gptutor: A Chatgpt-powered Programming Tool For Code Explanation Eason Chen, Ray Huang, Han-shin Chen, Yuen-hsien Tseng, Liang-yi Li
- Vipergpt: Visual Inference Via Python Execution For Reasoning Dídac Surís, Sachit Menon, Carl Vondrick
- The Falcon Series Of Open Language Models Ebtesam Almazrouei et al.
- GPT-4 Can Pass The Korean National Licensing Examination For Korean Medicine Doctors Dongyeop Jang, Tae-rim Yun, Choong-yeol Lee, Young-kyu Kwon, Chang-eop Kim
- Llm-blender: Ensembling Large Language Models With Pairwise Ranking And Generative Fusion Dongfu Jiang, Xiang Ren, Bill Yuchen Lin
- Speechgpt: Empowering Large Language Models With Intrinsic Cross-modal Conversational Abilities Dong Zhang et al.
- Read-only Prompt Optimization For Vision-language Few-shot Learning Dongjun Lee et al.
- MELTR: Meta Loss Transformer For Learning To Fine-tune Video Foundation Models Dohwan Ko et al.
- Evaluating Open-domain Question Answering In The Era Of Large Language Models Ehsan Kamalloo, Nouha Dziri, Charles L. A. Clarke, Davood Rafiei
- Minigpt-4: Enhancing Vision-language Understanding With Advanced Large Language Models Deyao Zhu, Jun Chen, Xiaoqian Shen, Xiang Li, Mohamed Elhoseiny
- Chatgpt Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions Deyao Zhu et al.
- Evaluating GPT-3.5 And GPT-4 Models On Brazilian University Admission Exams Desnes Nunes, Ricardo Primi, Ramon Pires, Roberto Lotufo, Rodrigo Nogueira
- Fine-tuning Chatgpt For Automatic Scoring Ehsan Latif, Xiaoming Zhai
- Using An LLM To Help With Code Understanding Daye Nam, Andrew Macvean, Vincent Hellendoorn, Bogdan Vasilescu, Brad Myers
- 3d-vista: Pre-trained Transformer For 3D Vision And Text Alignment Ziyu Zhu et al.
- CORE-GPT: Combining Open Access Research And Large Language Models For Credible, Trustworthy Question Answering David Pride, Matteo Cancellieri, Petr Knoth
- Response: Emergent Analogical Reasoning In Large Language Models Damian Hodel, Jevin West
- Have Llms Advanced Enough? A Challenging Problem Solving Benchmark For Large Language Models Daman Arora, Himanshu Gaurav Singh, Mausam
- Improving Accuracy Of GPT-3/4 Results On Biomedical Data Using A Retrieval-augmented Language Model David Soong et al.
- AI And The FCI: Can Chatgpt Project An Understanding Of Introductory Physics? Colin G. West
- Weak-to-strong Generalization: Eliciting Strong Capabilities With Weak Supervision Collin Burns et al.
- Llava-med: Training A Large Language-and-vision Assistant For Biomedicine In One Day Chunyuan Li et al.
- LIMA: Less Is More For Alignment Chunting Zhou et al.
- Chatgpt Evaluation On Sentence Level Relations: A Focus On Temporal, Causal, And Discourse Relations Chunkit Chan et al.
- Conversational Automated Program Repair Chunqiu Steven Xia, Lingming Zhang
- Progressive-hint Prompting Improves Reasoning In Large Language Models Chuanyang Zheng, Zhengying Liu, Enze Xie, Zhenguo Li, Yu Li
- Drivelm: Driving With Graph Visual Question Answering Chonghao Sima et al.
- A Study On The Implementation Of Generative AI Services Using An Enterprise Data-based LLM Application Architecture Cheonsu Jeong
- Distilled GPT For Source Code Summarization Chia-yi Su, Collin Mcmillan
- Llm-powered Data Augmentation For Enhanced Cross-lingual Performance Chenxi Whitehouse, Monojit Choudhury, Alham Fikri Aji
- Is Chatgpt A General-purpose Natural Language Processing Task Solver? Chengwei Qin et al.
- An Iterative Optimizing Framework For Radiology Report Summarization With Chatgpt Chong Ma et al.
- Supporting Qualitative Analysis With Large Language Models: Combining Codebook With GPT-3 For Deductive Coding Ziang Xiao, Xingdi Yuan, Q. Vera Liao, Rania Abdelghani, Pierre-yves Oudeyer
- Visual Chatgpt: Talking, Drawing And Editing With Visual Foundation Models Chenfei Wu et al.
- Memgpt: Towards Llms As Operating Systems Charles Packer et al.
- Supporting Human-ai Collaboration In Auditing Llms With Llms Charvi Rastogi, Marco Tulio Ribeiro, Nicholas King, Harsha Nori, Saleema Amershi
- One Small Step For Generative AI, One Giant Leap For AGI: A Complete Survey On Chatgpt In AIGC Era Chaoning Zhang et al.
- Blackvip: Black-box Visual Prompting For Robust Transfer Learning Changdae Oh et al.
- Hallucination Augmented Contrastive Learning For Multimodal Large Language Model Chaoya Jiang et al.
- Wizardlm: Empowering Large Language Models To Follow Complex Instructions Can Xu et al.
- Does GPT-4 Pass The Turing Test? Cameron R. Jones, Benjamin K. Bergen
- Pmc-llama: Towards Building Open-source Language Models For Medicine Chaoyi Wu et al.
- Chatgpt And A New Academic Reality: Artificial Intelligence-written Research Papers And The Ethics Of The Large Language Models In Scholarly Publishing Brady Lund et al.
- Large Language Models On Graphs: A Comprehensive Survey Bowen Jin et al.
- Prompting Or Fine-tuning? A Comparative Study Of Large Language Models For Taxonomy Construction Boqi Chen, Fandi Yi, Dániel Varró
- MIMIC-IT: Multi-modal In-context Instruction Tuning Bo Li et al.
- RWKV: Reinventing Rnns For The Transformer Era Bo Peng et al.
- Seed-bench-2: Benchmarking Multimodal Large Language Models Bohao Li et al.
- Video-llava: Learning United Visual Representation By Alignment Before Projection Bin Lin et al.
- Evaluation Of Chatgpt For Nlp-based Mental Health Applications Bishal Lamichhane
- Swiftsage: A Generative Agent With Fast And Slow Thinking For Complex Interactive Tasks Bill Yuchen Lin et al.
- Motiongpt: Human Motion As A Foreign Language Biao Jiang et al.
- Large Language Models In The Workplace: A Case Study On Prompt Engineering For Job Type Classification Benjamin Clavié, Alexandru Ciceu, Frederick Naylor, Guillaume Soulié, Thomas Brightwell
- Bad Actor, Good Advisor: Exploring The Role Of Large Language Models In Fake News Detection Beizhe Hu et al.
- Friend Or Foe? Exploring The Implications Of Large Language Models On The Science System Benedikt Fecher, Marcel Hebing, Melissa Laufer, Jörg Pohle, Fabian Sofsky
- Check Your Facts And Try Again: Improving Large Language Models With External Knowledge And Automated Feedback Baolin Peng et al.
- Instruction Tuning With GPT-4 Baolin Peng, Chunyuan Li, Pengcheng He, Michel Galley, Jianfeng Gao
- Expertprompting: Instructing Large Language Models To Be Distinguished Experts Benfeng Xu et al.
- How Close Is Chatgpt To Human Experts? Comparison Corpus, Evaluation, And Detection Biyang Guo et al.
- A Study Of Generative Large Language Model For Medical Research And Healthcare Cheng Peng et al.
- Coupling Large Language Models With Logic Programming For Robust And General Reasoning From Text Zhun Yang, Adam Ishay, Joohyung Lee
- Clinical Camel: An Open Expert-level Medical Language Model With Dialogue-based Knowledge Encoding Augustin Toma et al.
- Refactoring Programs Using Large Language Models With Few-shot Examples Atsushi Shirafuji, Yusuke Oda, Jun Suzuki, Makoto Morishita, Yutaka Watanobe
- The False Promise Of Imitating Proprietary Llms Arnav Gudibande et al.
- Exploring The Responses Of Large Language Models To Beginner Programmers' Help Requests Arto Hellas et al.
- Scaling Transformer To 1M Tokens And Beyond With RMT Aydar Bulatov, Yuri Kuratov, Yermek Kapushev, Mikhail S. Burtsev
- Better Zero-shot Reasoning With Role-play Prompting Aobo Kong et al.
- Chatgpt: Applications, Opportunities, And Threats Aram Bahrini et al.
- Med-halt: Medical Domain Hallucination Test For Large Language Models Ankit Pal, Logesh Kumar Umapathi, Malaikannan Sankarasubbu
- Expel: LLM Agents Are Experiential Learners Andrew Zhao et al.
- Universal And Transferable Adversarial Attacks On Aligned Language Models Andy Zou et al.
- On The Application Of Large Language Models For Language Teaching And Assessment Technology Andrew Caines et al.
- Fundamentals Of Generative Large Language Models And Perspectives In Cyber-defense Andrei Kucharavy et al.
- Chemcrow: Augmenting Large-language Models With Chemistry Tools Andres M Bran et al.
- Openassistant Conversations -- Democratizing Large Language Model Alignment Andreas Köpf et al.
- Generative AI: Implications And Applications For Education Anastasia Olnancy Olga et al.
- Opening Up Chatgpt: Tracking Openness, Transparency, And Accountability In Instruction-tuned Text Generators Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse
- How Good Are GPT Models At Machine Translation? A Comprehensive Evaluation Amr Hendy et al.
- Chatgpt Is A Remarkable Tool -- For Experts Amos Azaria, Rina Azoulay, Shulamit Reches
- Fighting Fire With Fire: Can Chatgpt Detect Ai-generated Text? Amrita Bhattacharjee, Huan Liu
- The Impact Of Positional Encoding On Length Generalization In Transformers Amirhossein Kazemnejad, Inkit Padhi, Karthikeyan Natesan Ramamurthy, Payel Das, Siva Reddy
- Toxicity In Chatgpt: Analyzing Persona-assigned Language Models Ameet Deshpande, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan
- Self-refine: Iterative Refinement With Self-feedback Aman Madaan et al.
- Multilingual Machine Translation With Large Language Models: Empirical Results And Analysis Wenhao Zhu et al.
- Large Language Models For Telecom: Forthcoming Impact On The Industry Ali Maatouk, Nicola Piovesan, Fadhel Ayed, Antonio De Domenico, Merouane Debbah
- A Categorical Archive Of Chatgpt Failures Ali Borji
- Jailbroken: How Does LLM Safety Training Fail? Alexander Wei, Nika Haghtalab, Jacob Steinhardt
- Smoothllm: Defending Large Language Models Against Jailbreaking Attacks Alexander Robey, Eric Wong, Hamed Hassani, George J. Pappas
- Poisoning Language Models During Instruction Tuning Alexander Wan, Eric Wallace, Sheng Shen, Dan Klein
- Can Chatgpt Forecast Stock Price Movements? Return Predictability And Large Language Models Alejandro Lopez-lira, Yuehua Tang
- Chatgpt: More Than A Weapon Of Mass Deception, Ethical Challenges And Responses From The Human-centered Artificial Intelligence (HCAI) Perspective Alejo Jose G. Sison, Marco Tulio Daza, Roberto Gozalo-brizuela, Eduardo C. Garrido-merchán
- What Does CLIP Know About A Red Circle? Visual Prompt Engineering For Vlms Aleksandar Shtedritski, Christian Rupprecht, Andrea Vedaldi
- Self-rag: Learning To Retrieve, Generate, And Critique Through Self-reflection Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, Hannaneh Hajishirzi
- Mamba: Linear-time Sequence Modeling With Selective State Spaces Albert Gu, Tri Dao
- Mistral 7B Albert Q. Jiang et al.
- Can Chatgpt And Bard Generate Aligned Assessment Items? A Reliability Analysis Against Human Performance Abdolvahab Khademi
- Conversational Ai-powered Design: Chatgpt As Designer, User, And Product A. Baki Kocaballi
- Toolllm: Facilitating Large Language Models To Master 16000+ Real-world Apis Yujia Qin et al.
- Enhancing Retrieval-augmented Large Language Models With Iterative Retrieval-generation Synergy Zhihong Shao et al.
- MM-REACT: Prompting Chatgpt For Multimodal Reasoning And Action Zhengyuan Yang et al.
- Translating Natural Language To Planning Goals With Large-language Models Yaqi Xie et al.
- RTLLM: An Open-source Benchmark For Design RTL Generation With Large Language Model Yao Lu, Shang Liu, Qijun Zhang, Zhiyao Xie
- Embodiedgpt: Vision-language Pre-training Via Embodied Chain Of Thought Yao Mu et al.
- Recommender Systems In The Era Of Large Language Models (llms) Zihuai Zhao et al.
- Flexgen: High-throughput Generative Inference Of Large Language Models With A Single GPU Ying Sheng et al.
- Can Chatgpt Reproduce Human-generated Labels? A Study Of Social Computing Tasks Yiming Zhu, Peixian Zhang, Ehsan-ul Haq, Pan Hui, Gareth Tyson
- Efficient And Effective Text Encoding For Chinese Llama And Alpaca Yiming Cui, Ziqing Yang, Xin Yao
- Can Chatgpt Replace Traditional KBQA Models? An In-depth Analysis Of The Question Answering Performance Of The GPT LLM Family Yiming Tan et al.
- Graph Neural Prompting With Large Language Models Yijun Tian et al.
- A Comparative Study Of Pretrained Language Models For Long Clinical Text Yikuan Li, Ramsey M. Wehbe, Faraz S. Ahmad, Hanyin Wang, Yuan Luo
- A Survey On Large Language Model (LLM) Security And Privacy: The Good, The Bad, And The Ugly Yifan Yao et al.
- A Comprehensive Survey Of Ai-generated Content (AIGC): A History Of Generative AI From GAN To Chatgpt Yihan Cao et al.
- Summary Of Chatgpt-related Research And Perspective Towards The Future Of Large Language Models Yiheng Liu et al.
- "Kelly Is A Warm Person, Joseph Is A Role Model": Gender Biases In Llm-generated Reference Letters Yixin Wan et al.
- Jailbreaking Chatgpt Via Prompt Engineering: An Empirical Study Yi Liu et al.
- INSTRUCTEVAL: Towards Holistic Evaluation Of Instruction-tuned Large Language Models Yew Ken Chia, Pengfei Hong, Lidong Bing, Soujanya Poria
- A Multitask, Multilingual, Multimodal Evaluation Of Chatgpt On Reasoning, Hallucination, And Interactivity Yejin Bang et al.
- Hugginggpt: Solving AI Tasks With Chatgpt And Its Friends In Hugging Face Yongliang Shen et al.
- Gpt4aigchip: Towards Next-generation AI Accelerator Design Automation Via Large Language Models Yonggan Fu et al.
- Assessing Cross-cultural Alignment Between Chatgpt And Human Societies: An Empirical Study Yong Cao et al.
- How Far Can Camels Go? Exploring The State Of Instruction Tuning On Open Resources Yizhong Wang et al.
- Biomedgpt: Open Multimodal Generative Pre-trained Transformer For Biomedicine Yizhen Luo et al.
- Analyzing And Mitigating Object Hallucination In Large Vision-language Models Yiyang Zhou et al.
- Pandagpt: One Model To Instruction-follow Them All Yixuan Su et al.
- Key-locked Rank One Editing For Text-to-image Personalization Yoad Tewel, Rinon Gal, Gal Chechik, Yuval Atzmon
- Low-rank Adaptation Of Large Language Model Rescoring For Parameter-efficient Speech Recognition Yu Yu et al.
- Fundamental Limitations Of Alignment In Large Language Models Yotam Wolf, Noam Wies, Oshri Avnery, Yoav Levine, Amnon Shashua
- The Dark Side Of Chatgpt: Legal And Ethical Challenges From Stochastic Parrots And Hallucination Zihao Li
- Llavar: Enhanced Visual Instruction Tuning For Text-rich Image Understanding Yanzhe Zhang et al.
- Alpacafarm: A Simulation Framework For Methods That Learn From Human Feedback Yann Dubois et al.
- Bubogpt: Enabling Visual Grounding In Multi-modal Llms Yang Zhao et al.
- Specializing Smaller Language Models Towards Multi-step Reasoning Yao Fu, Hao Peng, Litu Ou, Ashish Sabharwal, Tushar Khot
- G-eval: NLG Evaluation Using GPT-4 With Better Human Alignment Yang Liu et al.
- Trustworthy Llms: A Survey And Guideline For Evaluating Large Language Models' Alignment Yang Liu et al.
- Improving Large Language Models For Clinical Named Entity Recognition Via Prompt Engineering Yan Hu et al.
- Mental-llm: Leveraging Large Language Models For Mental Health Prediction Via Online Text Data Xuhai Xu et al.
- Emotional Intelligence Of Large Language Models Xuena Wang, Xueting Li, Zi Yin, Yue Wu, Liu Jia
- Classeval: A Manually-crafted Benchmark For Evaluating Llms On Class-level Code Generation Xueying Du et al.
- Improving Language Model Negotiation With Self-play And In-context Learning From AI Feedback Yao Fu, Hao Peng, Tushar Khot, Mirella Lapata
- Can Chatgpt Pass The Vietnamese National High School Graduation Examination? Xuan-quy Dao, Ngoc-bich Le, Xuan-dung Phan, Bac-bien Ngo
- Performance Comparison Of Large Language Models On VNHSGE English Dataset: Openai Chatgpt, Microsoft Bing Chat, And Google Bard Xuan-quy Dao
- In Chatgpt We Trust? Measuring And Characterizing The Reliability Of Chatgpt Xinyue Shen, Zeyuan Chen, Michael Backes, Yang Zhang
- "Do Anything Now": Characterizing And Evaluating In-the-wild Jailbreak Prompts On Large Language Models Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, Yang Zhang
- How Robust Is GPT-3.5 To Predecessors? A Comprehensive Study On Language Understanding Tasks Xuanting Chen et al.
- Wavcaps: A Chatgpt-assisted Weakly-labelled Audio Captioning Dataset For Audio-language Multimodal Research Xinhao Mei et al.
- Vision Language Models In Autonomous Driving: A Survey And Outlook Xingcheng Zhou et al.
- Query Rewriting For Retrieval-augmented Large Language Models Xinbei Ma, Yeyun Gong, Pengcheng He, Hai Zhao, Nan Duan
- How To Unleash The Power Of Large Language Models For Few-shot Relation Extraction? Xin Xu, Yuqi Zhu, Xiaohan Wang, Ningyu Zhang
- Rethinking The Evaluation For Conversational Recommendation In The Era Of Large Language Models Xiaolei Wang, Xinyu Tang, Wayne Xin Zhao, Jingyuan Wang, Ji-rong Wen
- Xuanyuan 2.0: A Large Chinese Financial Chat Model With Hundreds Of Billions Parameters Xuanyu Zhang, Qing Yang, Dongliang Xu
- Unveiling Security, Privacy, And Ethical Concerns Of Chatgpt Xiaodong Wu, Ran Duan, Jianbing Ni
- Deceptive AI Ecosystems: The Case Of Chatgpt Xiao Zhan, Yifan Xu, Stefan Sarkadi
- HPC-GPT: Integrating Large Language Model For High-performance Computing Xianzhong Ding et al.
- Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! Xiangyu Qi et al.
- Don't Trust Chatgpt When Your Question Is Not In English: A Study Of Multilingual Abilities And Types Of Llms Xiang Zhang, Senyu Li, Bradley Hauer, Ning Shi, Grzegorz Kondrak
- MMMU: A Massive Multi-discipline Multimodal Understanding And Reasoning Benchmark For Expert AGI Xiang Yue et al.
- Visual Adversarial Examples Jailbreak Aligned Large Language Models Xiangyu Qi et al.
- Representation Learning With Large Language Models For Recommendation Xubin Ren et al.
- Pali-3 Vision Language Models: Smaller, Faster, Stronger Xi Chen et al.
- The Unreasonable Effectiveness Of Few-shot Learning For Machine Translation Xavier Garcia et al.
- Cogagent: A Visual Language Model For GUI Agents Wenyi Hong et al.
- M3exam: A Multilingual, Multimodal, Multilevel Benchmark For Examining Large Language Models Wenxuan Zhang, Sharifah Mahani Aljunied, Chang Gao, Yew Ken Chia, Lidong Bing
- Is Chatgpt A Good Translator? Yes With GPT-4 As The Engine Wenxiang Jiao et al.
- Universalner: Targeted Distillation From Large Language Models For Open Named Entity Recognition Wenxuan Zhou, Sheng Zhang, Yu Gu, Muhao Chen, Hoifung Poon
- Instructblip: Towards General-purpose Vision-language Models With Instruction Tuning Wenliang Dai et al.
- SPHINX: The Joint Mixing Of Weights, Tasks, And Visual Embeddings For Multi-modal Large Language Models Ziyi Lin et al.
- Retentive Network: A Successor To Transformer For Large Language Models Yutao Sun et al.
- Longbench: A Bilingual, Multitask Benchmark For Long Context Understanding Yushi Bai et al.
- Minillm: Knowledge Distillation Of Large Language Models Yuxian Gu, Li Dong, Furu Wei, Minlie Huang
- Chatdoctor: A Medical Chat Model Fine-tuned On A Large Language Model Meta-ai (llama) Using Medical Domain Knowledge Yunxiang Li et al.
- Large Language Models Are Zero-shot Rankers For Recommender Systems Yupeng Hou et al.
- Chat-rec: Towards Interactive And Explainable Llms-augmented Recommender System Yunfan Gao et al.
- Character-llm: A Trainable Agent For Role-playing Yunfan Shao, Linyang Li, Junqi Dai, Xipeng Qiu
- Exploring The Impact Of Instruction Data Scaling On Large Language Models: An Empirical Study On Real-world Use Cases Yunjie Ji et al.
- On Evaluating Adversarial Robustness Of Large Vision-language Models Yunqing Zhao et al.
- Large Language Model As Attributed Training Data Generator: A Tale Of Diversity And Bias Yue Yu et al.
- Textbooks Are All You Need II: Phi-1.5 Technical Report Yuanzhi Li et al.
- Guiding Large Language Models Via Directional Stimulus Prompting Zekun Li et al.
- MEDITRON-70B: Scaling Medical Pretraining For Large Language Models Zeming Chen et al.
- Is Chatgpt A Good Sentiment Analyzer? A Preliminary Study Zengzhi Wang et al.
- C-eval: A Multi-level Multi-discipline Chinese Evaluation Suite For Foundation Models Yuzhen Huang et al.
- Learning Gain Differences Between Chatgpt And Human Tutor Generated Algebra Hints Zachary A. Pardos, Shreya Bhandari
- Let The Llms Talk: Simulating Human-to-human Conversational QA Via Zero-shot Llm-to-llm Interactions Zahra Abbasiantaeb, Yifei Yuan, Evangelos Kanoulas, Mohammad Aliannejadi
- Monitoring Ai-modified Content At Scale: A Case Study On The Impact Of Chatgpt On AI Conference Peer Reviews Weixin Liang et al.
- Earthgpt: A Universal Multi-modal Large Language Model For Multi-sensor Image Comprehension In Remote Sensing Domain Wei Zhang, Miaoxin Cai, Tong Zhang, Yin Zhuang, Xuerui Mao
- Assessing AI Detectors In Identifying Ai-generated Code: Implications For Education Wei Hung Pan et al.
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- Transformers Are Ssms: Generalized Models And Efficient Algorithms Through Structured State Space Duality Tri Dao, Albert Gu
- Chatglm: A Family Of Large Language Models From GLM-130B To GLM-4 All Tools Team GLM et al.
- Chatgpt As Research Scientist: Probing Gpt's Capabilities As A Research Librarian, Research Ethicist, Data Generator And Data Predictor Steven A. Lehr, Aylin Caliskan, Suneragiri Liyanage, Mahzarin R. Banaji
- The Era Of 1-bit Llms: All Large Language Models Are In 1.58 Bits Shuming Ma et al.
- Eyes Wide Shut? Exploring The Visual Shortcomings Of Multimodal Llms Shengbang Tong et al.
- Beyond Code Generation: An Observational Study Of Chatgpt Usage In Software Engineering Practice Ranim Khojah, Mazen Mohamad, Philipp Leitner, Francisco Gomes De Oliveira Neto
- Hidden Flaws Behind Expert-level Accuracy Of Multimodal GPT-4 Vision In Medicine Qiao Jin et al.
- Me Llama: Foundation Large Language Models For Medical Applications Qianqian Xie et al.
- From Text To Transformation: A Comprehensive Review Of Large Language Models' Versatility Pravneet Kaur et al.
- Large Language Model Capabilities In Perioperative Risk Prediction And Prognostication Philip Chung et al.
- SNIFFER: Multimodal Large Language Model For Explainable Out-of-context Misinformation Detection Peng Qi, Zehong Yan, Wynne Hsu, Mong Li Lee
- Jamba: A Hybrid Transformer-mamba Language Model Opher Lieber et al.
- Iris: An Ai-driven Virtual Tutor For Computer Science Education Patrick Bassner, Eduard Frankford, Stephan Krusche
- Findings Of The Second Babylm Challenge: Sample-efficient Pretraining On Developmentally Plausible Corpora Michael Y. Hu et al.
- A Survey Of Resource-efficient LLM And Multimodal Foundation Models Mengwei Xu et al.
- History Of Generative Artificial Intelligence (AI) Chatbots: Past, Present, And Future Development Md. Al-amin et al.
- Exploring Chatgpt And Its Impact On Society Md. Asraful Haque, Shuai Li
- Xlstm: Extended Long Short-term Memory Maximilian Beck et al.
- Large Legal Fictions: Profiling Legal Hallucinations In Large Language Models Matthew Dahl, Varun Magesh, Mirac Suzgun, Daniel E. Ho
- Codeaid: Evaluating A Classroom Deployment Of An Llm-based Programming Assistant That Balances Student And Educator Needs Majeed Kazemitabaar et al.
- Language Models For Code Completion: A Practical Evaluation Maliheh Izadi et al.
- Linrec: Linear Attention Mechanism For Long-term Sequential Recommender Systems Langming Liu et al.
- Capabilities Of Gemini Models In Medicine Khaled Saab et al.
- Data Is All You Need: Finetuning Llms For Chip Design Via An Automated Design-data Augmentation Framework Kaiyan Chang et al.
- Pixart-Σ: Weak-to-strong Training Of Diffusion Transformer For 4K Text-to-image Generation Junsong Chen et al.
- Clochat: Understanding How People Customize, Interact, And Experience Personas In Large Language Models Juhye Ha, Hyeon Jeon, Daeun Han, Jinwook Seo, Changhoon Oh
- Feedback-generation For Programming Exercises With GPT-4 Imen Azaiz, Natalie Kiesler, Sven Strickroth
- Revolutionizing Finance With Llms: An Overview Of Applications And Insights Huaqin Zhao et al.
- Benchmarking Retrieval-augmented Generation For Medicine Guangzhi Xiong, Qiao Jin, Zhiyong Lu, Aidong Zhang
- Closing The Gap Between Open-source And Commercial Large Language Models For Medical Evidence Summarization Gongbo Zhang et al.
- Building Better AI Agents: A Provocation On The Utilisation Of Persona In Llm-based Conversational Agents Guangzhi Sun, Xiao Zhan, Jose Such
- Gemma 2: Improving Open Language Models At A Practical Size Gemma Team et al.
- Gemini 1.5: Unlocking Multimodal Understanding Across Millions Of Tokens Of Context Gemini Team et al.
- Code-aware Prompting: A Study Of Coverage Guided Test Generation In Regression Setting Using LLM Gabriel Ryan et al.
- The Power Of Noise: Redefining Retrieval For RAG Systems Florin Cuconasu et al.
- Ai-tutoring In Software Engineering Education Eduard Frankford, Clemens Sauerwein, Patrick Bassner, Stephan Krusche, Ruth Breu
- Olmo: Accelerating The Science Of Language Models Dirk Groeneveld et al.
- Chemllm: A Chemical Large Language Model Di Zhang et al.
- Deepseek-v2: A Strong, Economical, And Efficient Mixture-of-experts Language Model DeepSeek-AI et al.
- Deepseek-coder: When The Large Language Model Meets Programming -- The Rise Of Code Intelligence Daya Guo et al.
- Generative AI In EU Law: Liability, Privacy, Intellectual Property, And Cybersecurity Claudio Novelli, Federico Casolari, Philipp Hacker, Giorgio Spedicato, Luciano Floridi
- Open Source Language Models Can Provide Feedback: Evaluating Llms' Ability To Help Students Using Gpt-4-as-a-judge Charles Koutcheme et al.
- MM1: Methods, Analysis & Insights From Multimodal LLM Pre-training Brandon Mckinzie et al.
- Moe-llava: Mixture Of Experts For Large Vision-language Models Bin Lin et al.
- Homogenization Effects Of Large Language Models On Human Creative Ideation Barrett R. Anderson, Jash Hemant Shah, Max Kreminski
- Taking The Next Step With Generative Artificial Intelligence: The Transformative Role Of Multimodal Large Language Models In Science Education Arne Bewersdorff et al.
- Gemini Goes To Med School: Exploring The Capabilities Of Multimodal Large Language Models On Medical Challenge Problems & Hallucinations Ankit Pal, Malaikannan Sankarasubbu
- RAG Vs Fine-tuning: Pipelines, Tradeoffs, And A Case Study On Agriculture Angels Balaguer et al.
- Why And When Llm-based Assistants Can Go Wrong: Investigating The Effectiveness Of Prompt-based Interactions For Software Help-seeking Anjali Khurana, Hari Subramonyam, Parmit K Chilana
- AI And Memory Wall Amir Gholami et al.
- Financial Statement Analysis With Large Language Models Alex Kim, Maximilian Muhn, Valeri Nikolaev
- Yi: Open Foundation Models By 01.AI 01.AI et al.
- Large Language Models For Data Annotation And Synthesis: A Survey Zhen Tan et al.
- Quality Of Answers Of Generative Large Language Models Vs Peer Patients For Interpreting Lab Test Results For Lay Patients: Evaluation Study Zhe He et al.
- How Far Are We To GPT-4V? Closing The Gap To Commercial Multimodal Models With Open-source Suites Zhe Chen et al.
- A Survey On Lora Of Large Language Models Yuren Mao et al.
- Large Language Model (LLM) AI Text Generation Detection Based On Transformer Deep Learning Algorithm Yuhong Mo, Hao Qin, Yushan Dong, Ziyi Zhu, Zhenglin Li
- Trustllm: Trustworthiness In Large Language Models Yue Huang et al.
- Understanding Llms: A Comprehensive Overview From Training To Inference Yiheng Liu et al.
- How Johnny Can Persuade Llms To Jailbreak Them: Rethinking Persuasion To Challenge AI Safety By Humanizing Llms Yi Zeng et al.
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
- Data-efficient Fine-tuning For Llm-based Recommendation Xinyu Lin et al.
- A Survey On RAG Meeting Llms: Towards Retrieval-augmented Large Language Models Wenqi Fan et al.
- Findings Of The Babylm Challenge: Sample-efficient Pretraining On Developmentally Plausible Corpora Alex Warstadt et al.
🏷 Multimodal Models
- Attention Strategies For Multi-source Sequence-to-sequence Learning Jindřich Libovický, Jindřich Helcl
- The Memad Submission To The WMT18 Multimodal Translation Task Stig-arne Grönroos et al.
- Unified Vision-language Pre-training For Image Captioning And VQA Luowei Zhou et al.
- Multimodal Attention Networks For Low-level Vision-and-language Navigation Federico Landi, Lorenzo Baraldi, Marcella Cornia, Massimiliano Corsini, Rita Cucchiara
- LXMERT: Learning Cross-modality Encoder Representations From Transformers Hao Tan, Mohit Bansal
- Unicoder-vl: A Universal Encoder For Vision And Language By Cross-modal Pre-training Gen Li et al.
- Fusion Of Detected Objects In Text For Visual Question Answering Chris Alberti, Jeffrey Ling, Michael Collins, David Reitter
- UNIMO: Towards Unified-modal Understanding And Generation Via Cross-modal Contrastive Learning Wei Li et al.
- KVL-BERT: Knowledge Enhanced Visual-and-linguistic BERT For Visual Commonsense Reasoning Dandan Song, Siyi Ma, Zhanchen Sun, Sicheng Yang, Lejian Liao
- Layoutlmv2: Multi-modal Pre-training For Visually-rich Document Understanding Yang Xu et al.
- VD-BERT: A Unified Vision And Dialog Transformer With BERT Yue Wang et al.
- Towards Learning A Generic Agent For Vision-and-language Navigation Via Pre-training Weituo Hao, Chunyuan Li, Xiujun Li, Lawrence Carin, Jianfeng Gao
- XGPT: Cross-modal Generative Pre-training For Image Captioning Qiaolin Xia et al.
- M3P: Learning Universal Representations Via Multitask Multilingual Multimodal Pre-training Minheng Ni et al.
- TIME: Text And Image Mutual-translation Adversarial Networks Bingchen Liu, Kunpeng Song, Yizhe Zhu, Gerard De Melo, Ahmed Elgammal
- Just Ask: Learning To Answer Questions From Millions Of Narrated Videos Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- A Recurrent Vision-and-language BERT For Navigation Yicong Hong, Qi Wu, Yuankai Qi, Cristian Rodriguez-opazo, Stephen Gould
- Behind The Scene: Revealing The Secrets Of Pre-trained Vision-and-language Models Jize Cao et al.
- Auto-captions On GIF: A Large-scale Video-sentence Dataset For Vision-language Pre-training Yingwei Pan et al.
- Look Before You Speak: Visually Contextualized Utterances Paul Hongsuck Seo, Arsha Nagrani, Cordelia Schmid
- Multilingual Speech Translation With Efficient Finetuning Of Pretrained Models Xian Li et al.
- DSTC8-AVSD: Multimodal Semantic Transformer Network With Retrieval Style Word Generator Hwanhee Lee et al.
- X-LXMERT: Paint, Caption And Answer Questions With Multi-modal Transformers Jaemin Cho, Jiasen Lu, Dustin Schwenk, Hannaneh Hajishirzi, Aniruddha Kembhavi
- TAP: Text-aware Pre-training For Text-vqa And Text-caption Zhengyuan Yang et al.
- Vokenization: Improving Language Understanding With Contextualized, Visual-grounded Supervision Hao Tan, Mohit Bansal
- Vlmo: Unified Vision-language Pre-training With Mixture-of-modality-experts Hangbo Bao et al.
- Ernie-vilg: Unified Generative Pre-training For Bidirectional Vision-language Generation Han Zhang et al.
- E2E-VLP: End-to-end Vision-language Pre-training Enhanced By Visual Learning Haiyang Xu et al.
- Wenlan: Bridging Vision And Language By Large-scale Multi-modal Pre-training Yuqi Huo et al.
- Unifying Multimodal Transformer For Bi-directional Image And Text Generation Yupan Huang, Hongwei Xue, Bei Liu, Yutong Lu
- Multimodal Dialogue Response Generation Qingfeng Sun et al.
- Advancing High-resolution Video-language Representation With Large-scale Video Transcriptions Hongwei Xue et al.
- FILIP: Fine-grained Interactive Language-image Pre-training Lewei Yao et al.
- Vision-and-language Or Vision-for-language? On Cross-modal Influence In Multimodal Transformers Stella Frank, Emanuele Bugliarello, Desmond Elliott
- KAT: A Knowledge Augmented Transformer For Vision-and-language Liangke Gui et al.
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- Align And Prompt: Video-and-language Pre-training With Entity Prompts Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C. H. Hoi
- Causal Attention For Vision-language Tasks Xu Yang, Hanwang Zhang, Guojun Qi, Jianfei Cai
- UC2: Universal Cross-lingual Cross-modal Vision-and-language Pre-training Mingyang Zhou et al.
- MAGMA -- Multimodal Augmentation Of Generative Models Through Adapter-based Finetuning Constantin Eichenberg, Sidney Black, Samuel Weinbach, Letitia Parcalabescu, Anette Frank
- Multimodal Transformer With Variable-length Memory For Vision-and-language Navigation Chuang Lin et al.
- NÜWA: Visual Synthesis Pre-training For Neural Visual World Creation Chenfei Wu et al.
- Simvlm: Simple Visual Language Model Pretraining With Weak Supervision Zirui Wang et al.
- See, Hear, Read: Leveraging Multimodality With Guided Attention For Abstractive Text Summarization Yash Kumar Atri, Shraman Pramanick, Vikram Goyal, Tanmoy Chakraborty
- Vision Guided Generative Pre-trained Language Models For Multimodal Abstractive Summarization Tiezheng Yu, Wenliang Dai, Zihan Liu, Pascale Fung
- MMBERT: Multimodal BERT Pretraining For Improved Medical VQA Yash Khare et al.
- Scheduled Sampling In Vision-language Pretraining With Decoupled Encoder-decoder Network Yehao Li, Yingwei Pan, Ting Yao, Jingwen Chen, Tao Mei
- SGEITL: Scene Graph Enhanced Image-text Learning For Visual Commonsense Reasoning Zhecan Wang et al.
- An Empirical Study Of Training End-to-end Vision-and-language Transformers Zi-yi Dou et al.
- Worst Of Both Worlds: Biases Compound In Pre-trained Vision-and-language Models Tejas Srinivasan, Yonatan Bisk
- Image Captioning For Effective Use Of Language Models In Knowledge-based Visual Question Answering Ander Salaberria, Gorka Azkune, Oier Lopez De Lacalle, Aitor Soroa, Eneko Agirre
- FLAVA: A Foundational Language And Vision Alignment Model Amanpreet Singh et al.
- An Empirical Study Of GPT-3 For Few-shot Knowledge-based VQA Zhengyuan Yang et al.
- Webqa: Multihop And Multimodal QA Yingshan Chang et al.
- Episodic Transformer For Vision-and-language Navigation Alexander Pashevich, Cordelia Schmid, Chen Sun
- KM-BART: Knowledge Enhanced Multimodal BART For Visual Commonsense Generation Yiran Xing et al.
- Clip-adapter: Better Vision-language Models With Feature Adapters Peng Gao et al.
- Visqa: X-raying Vision And Language Reasoning In Transformers Theo Jaunet et al.
- OPT: Omni-perception Pre-trainer For Cross-modal Understanding And Generation Jing Liu et al.
- Multi-modal Understanding And Generation For Medical Images And Text Via Vision-language Pre-training Jong Hak Moon, Hyungyung Lee, Woncheol Shin, Young-hak Kim, Edward Choi
- SIMMC 2.0: A Task-oriented Dialog Dataset For Immersive Multimodal Conversations Satwik Kottur, Seungwhan Moon, Alborz Geramifard, Babak Damavandi
- E-vil: A Dataset And Benchmark For Natural Language Explanations In Vision-language Tasks Maxime Kayser et al.
- Tip-adapter: Training-free Clip-adapter For Better Vision-language Modeling Renrui Zhang et al.
- UFO: A Unified Transformer For Vision-language Representation Learning Jianfeng Wang et al.
- Multimodal Few-shot Learning With Frozen Language Models Maria Tsimpoukelli et al.
- Lightningdot: Pre-training Visual-semantic Embeddings For Real-time Image-text Retrieval Siqi Sun et al.
- Unitab: Unifying Text And Box Outputs For Grounded Vision-language Modeling Zhengyuan Yang et al.
- Unifying Vision-and-language Tasks Via Text Generation Jaemin Cho, Jie Lei, Hao Tan, Mohit Bansal
- Diagnosing Vision-and-language Navigation: What Really Matters Wanrong Zhu et al.
- Learning To Prompt For Vision-language Models Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu
- FILM: Following Instructions In Language With Modular Methods So Yeon Min, Devendra Singh Chaplot, Pradeep Ravikumar, Yonatan Bisk, Ruslan Salakhutdinov
- Robotic Skill Acquisition Via Instruction Augmentation With Vision-language Models Ted Xiao et al.
- CLIP-TD: CLIP Targeted Distillation For Vision-language Tasks Zhecan Wang et al.
- Enabling Multimodal Generation On CLIP Via Vision-language Knowledge Distillation Wenliang Dai et al.
- Murag: Multimodal Retrieval-augmented Generator For Open Question Answering Over Images And Text Wenhu Chen, Hexiang Hu, Xi Chen, Pat Verga, William W. Cohen
- Prompt Tuning For Generative Multimodal Pretrained Models Hao Yang et al.
- Vl-beit: Generative Vision-language Pretraining Hangbo Bao, Wenhui Wang, Li Dong, Furu Wei
- Pali: A Jointly-scaled Multilingual Language-image Model Xi Chen et al.
- Hybrid Transformer With Multi-level Fusion For Multimodal Knowledge Graph Completion Xiang Chen et al.
- Layoutlmv3: Pre-training For Document AI With Unified Text And Image Masking Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei
- BLIP: Bootstrapping Language-image Pre-training For Unified Vision-language Understanding And Generation Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi
- VLC-BERT: Visual Question Answering With Contextualized Commonsense Knowledge Sahithya Ravi, Aditya Chinchure, Leonid Sigal, Renjie Liao, Vered Shwartz
- Vision-language Pre-training With Triple Contrastive Learning Jinyu Yang et al.
- GIT: A Generative Image-to-text Transformer For Vision And Language Jianfeng Wang et al.
- Coca: Contrastive Captioners Are Image-text Foundation Models Jiahui Yu et al.
- Revisiting The "video" In Video-language Understanding Shyamal Buch et al.
- Flamingo: A Visual Language Model For Few-shot Learning Jean-baptiste Alayrac et al.
- Dall-eval: Probing The Reasoning Skills And Social Biases Of Text-to-image Generation Models Jaemin Cho, Abhay Zala, Mohit Bansal
- When And Why Vision-language Models Behave Like Bags-of-words, And What To Do About It? Mert Yuksekgonul, Federico Bianchi, Pratyusha Kalluri, Dan Jurafsky, James Zou
- Can Machines Help Us Answering Question 16 In Datasheets, And In Turn Reflecting On Inappropriate Content? Patrick Schramowski, Christopher Tauchmann, Kristian Kersting
- Mixgen: A New Multi-modal Data Augmentation Xiaoshuai Hao et al.
- Vindlu: A Recipe For Effective Video-and-language Pretraining Feng Cheng et al.
- Vision-language Intelligence: Tasks, Representation Learning, And Large Models Feng Li et al.
- NLX-GPT: A Model For Natural Language Explanations In Vision And Vision-language Tasks Fawaz Sammani, Tanmoy Mukherjee, Nikos Deligiannis
- Sgva-clip: Semantic-guided Visual Adapting Of Vision-language Models For Few-shot Image Classification Fang Peng, Xiaoshan Yang, Linhui Xiao, Yaowei Wang, Changsheng Xu
- Matcha: Enhancing Visual Language Pretraining With Math Reasoning And Chart Derendering Fangyu Liu et al.
- Prompt Distribution Learning Yuning Lu, Jianzhuang Liu, Yonggang Zhang, Yajing Liu, Xinmei Tian
- Vl-interpret: An Interactive Visualization Tool For Interpreting Vision-language Transformers Estelle Aflalo et al.
- IGLUE: A Benchmark For Transfer Learning Across Modalities, Tasks, And Languages Emanuele Bugliarello et al.
- CREPE: Can Vision-language Foundation Models Reason Compositionally? Zixian Ma et al.
- LAVIS: A Library For Language-vision Intelligence Dongxu Li et al.
- Altclip: Altering The Language Encoder In CLIP For Extended Language Capabilities Zhongzhi Chen et al.
- VIMA: General Robot Manipulation With Multimodal Prompts Yunfan Jiang et al.
- Visual-language Navigation Pretraining Via Prompt-based Environmental Self-exploration Xiwen Liang, Fengda Zhu, Lingling Li, Hang Xu, Xiaodan Liang
- Unified Vision And Language Prompt Learning Yuhang Zang, Wei Li, Kaiyang Zhou, Chen Huang, Chen Change Loy
- Mplug: Effective And Efficient Vision-language Learning By Cross-modal Skip-connections Chenliang Li et al.
- A Unified End-to-end Retriever-reader Framework For Knowledge-based VQA Yangyang Guo et al.
- Scaling Language-image Pre-training Via Masking Yanghao Li, Haoqi Fan, Ronghang Hu, Christoph Feichtenhofer, Kaiming He
- Texts As Images In Prompt Tuning For Multi-label Image Recognition Zixian Guo et al.
- Long-form Video-language Pre-training With Multimodal Temporal Contrastive Learning Yuchong Sun et al.
- Language Models Are General-purpose Interfaces Yaru Hao et al.
- Multimodal Knowledge Alignment With Reinforcement Learning Youngjae Yu et al.
- Language Models Can See: Plugging Visual Controls In Text Generation Yixuan Su et al.
- Prompt-aligned Gradient For Prompt Tuning Beier Zhu, Yulei Niu, Yucheng Han, Yue Wu, Hanwang Zhang
- Mslam: Massively Multilingual Joint Pre-training For Speech And Text Ankur Bapna et al.
- Socratic Models: Composing Zero-shot Multimodal Reasoning With Language Andy Zeng et al.
- Dual Modality Prompt Tuning For Vision-language Pre-trained Model Yinghui Xing et al.
- Position-guided Text Prompt For Vision-language Pre-training Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan
- REVEAL: Retrieval-augmented Visual-language Pre-training With Multi-source Multimodal Knowledge Memory Ziniu Hu et al.
- PEVL: Position-enhanced Pre-training And Prompt Tuning For Vision-language Models Yuan Yao et al.
- Learn To Explain: Multimodal Reasoning Via Thought Chains For Science Question Answering Pan Lu et al.
- What Matters In Language Conditioned Robotic Imitation Learning Over Unstructured Data Oier Mees, Lukas Hermann, Wolfram Burgard
- "this Is My Unicorn, Fluffy": Personalizing Frozen Vision-language Representations Niv Cohen, Rinon Gal, Eli A. Meirom, Gal Chechik, Yuval Atzmon
- Learning To Compose Soft Prompts For Compositional Zero-shot Learning Nihal V. Nayak, Peilin Yu, Stephen H. Bach
- Vl-checklist: Evaluating Pre-trained Vision-language Models With Objects, Attributes And Relations Tiancheng Zhao et al.
- Maple: Multi-modal Prompt Learning Muhammad Uzair Khattak, Hanoona Rasheed, Muhammad Maaz, Salman Khan, Fahad Shahbaz Khan
- Retrieval-augmented Multimodal Language Modeling Michihiro Yasunaga et al.
- CLIPPO: Image-and-language Understanding From Pixels Only Michael Tschannen, Basil Mustafa, Neil Houlsby
- Make-a-video: Text-to-video Generation Without Text-video Data Uriel Singer et al.
- OFA: Unifying Architectures, Tasks, And Modalities Through A Simple Sequence-to-sequence Learning Framework Peng Wang et al.
- From Image To Language: A Critical Analysis Of Visual Question Answering (VQA) Approaches, Challenges, And Opportunities Md Farhan Ishmam, Md Sakib Hossain Shovon, M. F. Mridha, Nilanjan Dey
- Enhancing CLIP With GPT-4: Harnessing Visual Descriptions As Prompts Mayug Maniparambil et al.
- Drivegpt4: Interpretable End-to-end Autonomous Driving Via Large Language Model Zhenhua Xu et al.
- Internvl: Scaling Up Vision Foundation Models And Aligning For Generic Visual-linguistic Tasks Zhe Chen et al.
- Applenet: Visual Attention Parameterized Prompt Learning For Few-shot Remote Sensing Image Generalization Using CLIP Mainak Singha, Ankit Jha, Bhupendra Solanki, Shirsha Bose, Biplab Banerjee
- Generative Artificial Intelligence In Learning Analytics: Contextualising Opportunities And Challenges Through The Learning Analytics Cycle Lixiang Yan, Roberto Martinez-maldonado, Dragan Gašević
- Driving With Llms: Fusing Object-level Vector Modality For Explainable Autonomous Driving Long Chen et al.
- A Survey On Hallucination In Large Language Models: Principles, Taxonomy, Challenges, And Open Questions Lei Huang et al.
- ULIP-2: Towards Scalable Multimodal Pre-training For 3D Understanding Le Xue et al.
- Geochat: Grounded Large Vision-language Model For Remote Sensing Kartik Kuckreja et al.
- Waffling Around For Performance: Visual Classification With Random Words And Broad Concepts Karsten Roth et al.
- Tinyclip: CLIP Distillation Via Affinity Mimicking And Weight Inheritance Kan Stephen Wu et al.
- ALIP: Adaptive Language-image Pre-training With Synthetic Caption Kaicheng Yang et al.
- Evaluation And Analysis Of Hallucination In Large Vision-language Models Junyang Wang et al.
- BLIP-2: Bootstrapping Language-image Pre-training With Frozen Image Encoders And Large Language Models Junnan Li, Dongxu Li, Silvio Savarese, Steven Hoi
- Honeybee: Locality-enhanced Projector For Multimodal LLM Junbum Cha, Wooyoung Kang, Jonghwan Mun, Byungseok Roh
- Transferable Decoding With Visual Entities For Zero-shot Image Captioning Junjie Fei et al.
- Minigpt-v2: Large Language Model As A Unified Interface For Vision-language Multi-task Learning Jun Chen et al.
- Generating Images With Multimodal Language Models Jing Yu Koh, Daniel Fried, Ruslan Salakhutdinov
- Grounding Language Models To Images For Multimodal Inputs And Outputs Jing Yu Koh, Ruslan Salakhutdinov, Daniel Fried
- A Systematic Survey Of Prompt Engineering On Vision-language Foundation Models Jindong Gu et al.
- Empowering Molecule Discovery For Molecule-caption Translation With Large Language Models: A Chatgpt Perspective Jiatong Li et al.
- Unified-io 2: Scaling Autoregressive Multimodal Models With Vision, Language, Audio, And Action Jiasen Lu et al.
- Set-of-mark Prompting Unleashes Extraordinary Visual Grounding In GPT-4V Jianwei Yang et al.
- Imagebind-llm: Multi-modality Instruction Tuning Jiaming Han et al.
- Onellm: One Framework To Align All Modalities With Language Jiaming Han et al.
- Llm-grounder: Open-vocabulary 3D Visual Grounding With Large Language Model As An Agent Jianing Yang et al.
- Ureader: Universal Ocr-free Visually-situated Language Understanding With Multimodal Large Language Model Jiabo Ye et al.
- Physically Grounded Vision-language Models For Robotic Manipulation Jensen Gao et al.
- More Robots Are Coming: Large Multimodal Models (chatgpt) Can Solve Visually Diverse Images Of Parsons Problems Irene Hou et al.
- Ip-adapter: Text Compatible Image Prompt Adapter For Text-to-image Diffusion Models Hu Ye, Jun Zhang, Sibo Liu, Xiao Han, Wei Yang
- VIP5: Towards Multimodal Foundation Models For Recommendation Shijie Geng, Juntao Tan, Shuchang Liu, Zuohui Fu, Yongfeng Zhang
- Scaling Vision-language Models With Sparse Mixture Of Experts Sheng Shen et al.
- Language Is Not All You Need: Aligning Perception With Language Models Shaohan Huang et al.
- Next-gpt: Any-to-any Multimodal LLM Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, Tat-seng Chua
- Gpt4tools: Teaching Large Language Model To Use Tools Via Self-instruction Rui Yang et al.
- Prompting For Multimodal Hateful Meme Classification Rui Cao, Roy Ka-wei Lee, Wen-haw Chong, Jing Jiang
- Retrieving Multimodal Information For Augmented Generation: A Survey Ruochen Zhao et al.
- Retrieval-augmented Image Captioning Rita Ramos, Desmond Elliott, Bruno Martins
- Pro-cap: Leveraging A Frozen Vision-language Model For Hateful Meme Detection Rui Cao et al.
- Mplug-owl: Modularization Empowers Large Language Models With Multimodality Qinghao Ye et al.
- Pushing Large Language Models To The 6G Edge: Vision, Challenges, And Opportunities Zheng Lin et al.
- Masked Vision And Language Pre-training With Unimodal And Multimodal Contrastive Losses For Medical Visual Question Answering Pengfei Li, Gang Liu, Jinlong He, Zixu Zhao, Shenjun Zhong
- Visually-prompted Language Model For Fine-grained Scene Graph Generation In An Open World Qifan Yu et al.
- Chat-univi: Unified Visual Representation Empowers Large Language Models With Image And Video Understanding Peng Jin, Ryuichi Takanobu, Wancai Zhang, Xiaochun Cao, Li Yuan
- Audiopalm: A Large Language Model That Can Speak And Listen Paul K. Rubenstein et al.
- Internlm-xcomposer: A Vision-language Large Model For Advanced Text-image Comprehension And Composition Pan Zhang et al.
- GPT-4 Technical Report OpenAI et al.
- Fusecap: Leveraging Large Language Models For Enriched Fused Image Captions Noam Rotstein, David Bensaid, Shaked Brody, Roy Ganz, Ron Kimmel
- Are Aligned Neural Networks Adversarially Aligned? Nicholas Carlini et al.
- Large Language Models Are Zero-shot Time Series Forecasters Nate Gruver, Marc Finzi, Shikai Qiu, Andrew Gordon Wilson
- Med-flamingo: A Multimodal Medical Few-shot Learner Michael Moor et al.
- Video-chatgpt: Towards Detailed Video Understanding Via Large Vision And Language Models Muhammad Maaz, Hanoona Rasheed, Salman Khan, Fahad Shahbaz Khan
- RLHF-V: Towards Trustworthy Mllms Via Behavior Alignment From Fine-grained Correctional Human Feedback Tianyu Yu et al.
- Hallusionbench: An Advanced Diagnostic Suite For Entangled Language Hallucination And Visual Illusion In Large Vision-language Models Tianrui Guan et al.
- Multimodal-gpt: A Vision And Language Model For Dialogue With Humans Tao Gong et al.
- Caption Anything: Interactive Image Description With Diverse Multimodal Controls Teng Wang et al.
- LL3DA: Visual Interactive Instruction Tuning For Omni-3d Understanding, Reasoning, And Planning Sijin Chen et al.
- Mitigating Object Hallucinations In Large Vision-language Models Through Visual Contrastive Decoding Sicong Leng et al.
- A Survey On Multimodal Large Language Models Shukang Yin et al.
- Timechat: A Time-sensitive Multimodal Large Language Model For Long Video Understanding Shuhuai Ren, Linli Yao, Shicheng Li, Xu Sun, Lu Hou
- Mm-vet: Evaluating Large Multimodal Models For Integrated Capabilities Weihao Yu et al.
- Alpha-clip: A CLIP Model Focusing On Wherever You Want Zeyi Sun et al.
- BLIVA: A Simple Multimodal LLM For Better Handling Of Text-rich Visual Questions Wenbo Hu et al.
- Ferret: Refer And Ground Anything Anywhere At Any Granularity Haoxuan You et al.
- Improved Baselines With Visual Instruction Tuning Haotian Liu, Chunyuan Li, Yuheng Li, Yong Jae Lee
- Visual Instruction Tuning Haotian Liu, Chunyuan Li, Qingyang Wu, Yong Jae Lee
- Glamm: Pixel Grounding Large Multimodal Model Hanoona Rasheed et al.
- Video-llama: An Instruction-tuned Audio-visual Language Model For Video Understanding Hang Zhang, Xin Li, Lidong Bing
- Mplug-2: A Modularized Multi-modal Foundation Model Across Text, Image And Video Haiyang Xu et al.
- GPT-4 Enhanced Multimodal Grounding For Autonomous Driving: Leveraging Cross-modal Attention With Large Language Models Haicheng Liao et al.
- Chatgpt For Shaping The Future Of Dentistry: The Potential Of Multi-modal Large Language Model Hanyao Huang et al.
- Cheap And Quick: Efficient Vision-language Instruction Tuning For Large Language Models Gen Luo et al.
- Gemini: A Family Of Highly Capable Multimodal Models Gemini Team et al.
- Multimodal Chatgpt For Medical Applications: An Experimental Study Of GPT-4V Zhiling Yan et al.
- Aligning Large Multimodal Models With Factually Augmented RLHF Zhiqing Sun et al.
- Kosmos-2: Grounding Multimodal Large Language Models To The World Zhiliang Peng et al.
- Speechgpt: Empowering Large Language Models With Intrinsic Cross-modal Conversational Abilities Dong Zhang et al.
- Read-only Prompt Optimization For Vision-language Few-shot Learning Dongjun Lee et al.
- The Vector Grounding Problem Dimitri Coelho Mollo, Raphaël Millière
- Minigpt-4: Enhancing Vision-language Understanding With Advanced Large Language Models Deyao Zhu, Jun Chen, Xiaoqian Shen, Xiang Li, Mohamed Elhoseiny
- 3d-vista: Pre-trained Transformer For 3D Vision And Text Alignment Ziyu Zhu et al.
- Palm-e: An Embodied Multimodal Language Model Danny Driess et al.
- Llava-med: Training A Large Language-and-vision Assistant For Biomedicine In One Day Chunyuan Li et al.
- Multimodal Foundation Models: From Specialists To General-purpose Assistants Chunyuan Li et al.
- Drivelm: Driving With Graph Visual Question Answering Chonghao Sima et al.
- Debiasing Vision-language Models Via Biased Prompts Ching-yao Chuang, Varun Jampani, Yuanzhen Li, Antonio Torralba, Stefanie Jegelka
- MME: A Comprehensive Evaluation Benchmark For Multimodal Large Language Models Chaoyou Fu et al.
- Hallucination Augmented Contrastive Learning For Multimodal Large Language Model Chaoya Jiang et al.
- Compositional Chain-of-thought Prompting For Large Multimodal Models Chancharik Mitra, Brandon Huang, Trevor Darrell, Roei Herzig
- MIMIC-IT: Multi-modal In-context Instruction Tuning Bo Li et al.
- Seed-bench-2: Benchmarking Multimodal Large Language Models Bohao Li et al.
- Video-llava: Learning United Visual Representation By Alignment Before Projection Bin Lin et al.
- Vtimellm: Empower LLM To Grasp Video Moments Bin Huang, Xin Wang, Hong Chen, Zihan Song, Wenwu Zhu
- RT-2: Vision-language-action Models Transfer Web Knowledge To Robotic Control Anthony Brohan et al.
- Openflamingo: An Open-source Framework For Training Large Autoregressive Vision-language Models Anas Awadalla et al.
- What Does CLIP Know About A Red Circle? Visual Prompt Engineering For Vlms Aleksandar Shtedritski, Christian Rupprecht, Andrea Vedaldi
- Clipsyntel: CLIP And LLM Synergy For Multimodal Question Summarization In Healthcare Akash Ghosh et al.
- MM-REACT: Prompting Chatgpt For Multimodal Reasoning And Action Zhengyuan Yang et al.
- Embodiedgpt: Vision-language Pre-training Via Embodied Chain Of Thought Yao Mu et al.
- Beyond Chain-of-thought, Effective Graph-of-thought Reasoning In Language Models Yao Yao, Zuchao Li, Hai Zhao
- Chatpose: Chatting About 3D Human Pose Yao Feng et al.
- 3D-LLM: Injecting The 3D World Into Large Language Models Yining Hong et al.
- Graph Neural Prompting With Large Language Models Yijun Tian et al.
- A Comprehensive Survey Of Ai-generated Content (AIGC): A History Of Generative AI From GAN To Chatgpt Yihan Cao et al.
- Evaluating Object Hallucination In Large Vision-language Models Yifan Li et al.
- Making Large Language Models Perform Better In Knowledge Graph Completion Yichi Zhang et al.
- A Multitask, Multilingual, Multimodal Evaluation Of Chatgpt On Reasoning, Hallucination, And Interactivity Yejin Bang et al.
- Biomedgpt: Open Multimodal Generative Pre-trained Transformer For Biomedicine Yizhen Luo et al.
- Analyzing And Mitigating Object Hallucination In Large Vision-language Models Yiyang Zhou et al.
- Pandagpt: One Model To Instruction-follow Them All Yixuan Su et al.
- Bubogpt: Enabling Visual Grounding In Multi-modal Llms Yang Zhao et al.
- Chat With The Environment: Interactive Multimodal Perception Using Large Language Models Xufeng Zhao, Mengdi Li, Cornelius Weber, Muhammad Burhan Hafez, Stefan Wermter
- Wavcaps: A Chatgpt-assisted Weakly-labelled Audio Captioning Dataset For Audio-language Multimodal Research Xinhao Mei et al.
- Vision Language Models In Autonomous Driving: A Survey And Outlook Xingcheng Zhou et al.
- Delving Into Multimodal Prompting For Fine-grained Visual Classification Xin Jiang et al.
- LISA: Reasoning Segmentation Via Large Language Model Xin Lai et al.
- MMMU: A Massive Multi-discipline Multimodal Understanding And Reasoning Benchmark For Expert AGI Xiang Yue et al.
- Visual Adversarial Examples Jailbreak Aligned Large Language Models Xiangyu Qi et al.
- Pali-3 Vision Language Models: Smaller, Faster, Stronger Xi Chen et al.
- M3exam: A Multilingual, Multimodal, Multilevel Benchmark For Examining Large Language Models Wenxuan Zhang, Sharifah Mahani Aljunied, Chang Gao, Yew Ken Chia, Lidong Bing
- Instructblip: Towards General-purpose Vision-language Models With Instruction Tuning Wenliang Dai et al.
- LIV: Language-image Representations And Rewards For Robotic Control Yecheng Jason Ma et al.
- SPHINX: The Joint Mixing Of Weights, Tasks, And Visual Embeddings For Multi-modal Large Language Models Ziyi Lin et al.
- On Evaluating Adversarial Robustness Of Large Vision-language Models Yunqing Zhao et al.
- Contextual Object Detection With Multimodal Large Language Models Yuhang Zang, Wei Li, Jun Han, Kaiyang Zhou, Chen Change Loy
- Preventing Zero-shot Transfer Degradation In Continual Learning Of Vision-language Models Zangwei Zheng et al.
- Llava-mr: Large Language-and-vision Assistant For Video Moment Retrieval Weiheng Lu et al.
- Earthgpt: A Universal Multi-modal Large Language Model For Multi-sensor Image Comprehension In Remote Sensing Domain Wei Zhang, Miaoxin Cai, Tong Zhang, Yin Zhuang, Xuerui Mao
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- Eyes Wide Shut? Exploring The Visual Shortcomings Of Multimodal Llms Shengbang Tong et al.
- Hidden Flaws Behind Expert-level Accuracy Of Multimodal GPT-4 Vision In Medicine Qiao Jin et al.
- A Systematic Survey Of Prompt Engineering In Large Language Models: Techniques And Applications Pranab Sahoo et al.
- SNIFFER: Multimodal Large Language Model For Explainable Out-of-context Misinformation Detection Peng Qi, Zehong Yan, Wynne Hsu, Mong Li Lee
- When Large Language Model Agents Meet 6G Networks: Perception, Grounding, And Alignment Minrui Xu et al.
- Findings Of The Second Babylm Challenge: Sample-efficient Pretraining On Developmentally Plausible Corpora Michael Y. Hu et al.
- A Survey Of Resource-efficient LLM And Multimodal Foundation Models Mengwei Xu et al.
- Capabilities Of Gemini Models In Medicine Khaled Saab et al.
- Gemini 1.5: Unlocking Multimodal Understanding Across Millions Of Tokens Of Context Gemini Team et al.
- The Revolution Of Multimodal Large Language Models: A Survey Davide Caffagni et al.
- Generative AI In EU Law: Liability, Privacy, Intellectual Property, And Cybersecurity Claudio Novelli, Federico Casolari, Philipp Hacker, Giorgio Spedicato, Luciano Floridi
- MM1: Methods, Analysis & Insights From Multimodal LLM Pre-training Brandon Mckinzie et al.
- Moe-llava: Mixture Of Experts For Large Vision-language Models Bin Lin et al.
- Taking The Next Step With Generative Artificial Intelligence: The Transformative Role Of Multimodal Large Language Models In Science Education Arne Bewersdorff et al.
- Gemini Goes To Med School: Exploring The Capabilities Of Multimodal Large Language Models On Medical Challenge Problems & Hallucinations Ankit Pal, Malaikannan Sankarasubbu
- Yi: Open Foundation Models By 01.AI 01.AI et al.
- How Far Are We To GPT-4V? Closing The Gap To Commercial Multimodal Models With Open-source Suites Zhe Chen et al.
- A Review Of Modern Recommender Systems Using Generative Models (gen-recsys) Yashar Deldjoo et al.
- Searching For Best Practices In Retrieval-augmented Generation Xiaohua Wang et al.
🏷 NeurIPS
🏷 Pre-Training
- Multilingual Constituency Parsing With Self-attention And Pre-training Nikita Kitaev, Steven Cao, Dan Klein
- Improving Machine Reading Comprehension With General Reading Strategies Kai Sun, Dian Yu, Dong Yu, Claire Cardie
- BERT: Pre-training Of Deep Bidirectional Transformers For Language Understanding Jacob Devlin, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Unified Vision-language Pre-training For Image Captioning And VQA Luowei Zhou et al.
- Sample Efficient Text Summarization Using A Single Pre-trained Transformer Urvashi Khandelwal, Kevin Clark, Dan Jurafsky, Lukasz Kaiser
- BART: Denoising Sequence-to-sequence Pre-training For Natural Language Generation, Translation, And Comprehension Mike Lewis et al.
- Structbert: Incorporating Language Structures Into Pre-training For Deep Language Understanding Wei Wang et al.
- Acquiring Knowledge From Pre-trained Model To Neural Machine Translation Rongxiang Weng, Heng Yu, Shujian Huang, Shanbo Cheng, Weihua Luo
- Align, Mask And Select: A Simple Method For Incorporating Commonsense Knowledge Into Language Representation Models Zhi-xiu Ye, Qian Chen, Wen Wang, Zhen-hua Ling
- Olmpics -- On What Language Model Pre-training Captures Alon Talmor, Yanai Elazar, Yoav Goldberg, Jonathan Berant
- Cross-lingual Natural Language Generation Via Pre-training Zewen Chi et al.
- PEGASUS: Pre-training With Extracted Gap-sentences For Abstractive Summarization Jingqing Zhang, Yao Zhao, Mohammad Saleh, Peter J. Liu
- BERT For Joint Intent Classification And Slot Filling Qian Chen, Zhu Zhuo, Wen Wang
- PLATO: Pre-trained Dialogue Generation Model With Discrete Latent Variable Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang
- Distilbert, A Distilled Version Of BERT: Smaller, Faster, Cheaper And Lighter Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
- Unicoder: A Universal Language Encoder By Pre-training With Multiple Cross-lingual Tasks Haoyang Huang et al.
- LXMERT: Learning Cross-modality Encoder Representations From Transformers Hao Tan, Mohit Bansal
- Well-read Students Learn Better: On The Importance Of Pre-training Compact Models Iulia Turc, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Unified Language Model Pre-training For Natural Language Understanding And Generation Li Dong et al.
- UER: An Open-source Toolkit For Pre-training Models Zhe Zhao et al.
- A Pre-training Based Personalized Dialogue Generation Model With Persona-sparse Data Yinhe Zheng, Rongsheng Zhang, Xiaoxi Mao, Minlie Huang
- Reweighted Proximal Pruning For Large-scale Language Representation Fu-ming Guo, Sijia Liu, Finlay S. Mungall, Xue Lin, Yanzhi Wang
- Tinybert: Distilling BERT For Natural Language Understanding Xiaoqi Jiao et al.
- Review Conversational Reading Comprehension Hu Xu, Bing Liu, Lei Shu, Philip S. Yu
- VL-BERT: Pre-training Of Generic Visual-linguistic Representations Weijie Su et al.
- MASS: Masked Sequence To Sequence Pre-training For Language Generation Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-yan Liu
- A Simple But Effective Method To Incorporate Multi-turn Context With BERT For Conversational Machine Comprehension Yasuhito Ohsugi, Itsumi Saito, Kyosuke Nishida, Hisako Asano, Junji Tomita
- Leveraging Pre-trained Checkpoints For Sequence Generation Tasks Sascha Rothe, Shashi Narayan, Aliaksei Severyn
- Dialogpt: Large-scale Generative Pre-training For Conversational Response Generation Yizhe Zhang et al.
- Visualizing And Understanding The Effectiveness Of BERT Yaru Hao, Li Dong, Furu Wei, Ke Xu
- Tree Transformer: Integrating Tree Structures Into Self-attention Yau-shian Wang, Hung-yi Lee, Yun-nung Chen
- Model Compression With Two-stage Multi-teacher Knowledge Distillation For Web Question Answering System Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang
- Unicoder-vl: A Universal Encoder For Vision And Language By Cross-modal Pre-training Gen Li et al.
- Hello, It's GPT-2 -- How Can I Help You? Towards The Use Of Pretrained Language Models For Task-oriented Dialogue Systems Paweł Budzianowski, Ivan Vulić
- Blockwise Self-attention For Long Document Understanding Jiezhong Qiu et al.
- Span Selection Pre-training For Question Answering Michael Glass et al.
- Lakhnes: Improving Multi-instrumental Music Generation With Cross-domain Pre-training Chris Donahue, Huanru Henry Mao, Yiting Ethan Li, Garrison W. Cottrell, Julian Mcauley
- Learning To Few-shot Learn Across Diverse Natural Language Classification Tasks Trapit Bansal, Rishikesh Jha, Andrew Mccallum
- Towards Making The Most Of BERT In Neural Machine Translation Jiacheng Yang et al.
- Exploring The Limits Of Transfer Learning With A Unified Text-to-text Transformer Colin Raffel et al.
- UNIMO: Towards Unified-modal Understanding And Generation Via Cross-modal Contrastive Learning Wei Li et al.
- Low-rank Bottleneck In Multi-head Attention Models Srinadh Bhojanapalli, Chulhee Yun, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar
- Pre-training Text-to-text Transformers For Concept-centric Common Sense Wangchunshu Zhou et al.
- XGLUE: A New Benchmark Dataset For Cross-lingual Pre-training, Understanding And Generation Yaobo Liang et al.
- Knowledge-driven Data Construction For Zero-shot Evaluation In Commonsense Question Answering Kaixin Ma et al.
- REALM: Retrieval-augmented Language Model Pre-training Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-wei Chang
- KRISP: Integrating Implicit And Symbolic Knowledge For Open-domain Knowledge-based VQA Kenneth Marino, Xinlei Chen, Devi Parikh, Abhinav Gupta, Marcus Rohrbach
- Injecting Numerical Reasoning Skills Into Language Models Mor Geva, Ankit Gupta, Jonathan Berant
- KVL-BERT: Knowledge Enhanced Visual-and-linguistic BERT For Visual Commonsense Reasoning Dandan Song, Siyi Ma, Zhanchen Sun, Sicheng Yang, Lejian Liao
- Layoutlmv2: Multi-modal Pre-training For Visually-rich Document Understanding Yang Xu et al.
- Optimus: Organizing Sentences Via Pre-trained Modeling Of A Latent Space Chunyuan Li et al.
- Ternarybert: Distillation-aware Ultra-low Bit BERT Wei Zhang et al.
- TOD-BERT: Pre-trained Natural Language Understanding For Task-oriented Dialogue Chien-sheng Wu, Steven Hoi, Richard Socher, Caiming Xiong
- Towards Learning A Generic Agent For Vision-and-language Navigation Via Pre-training Weituo Hao, Chunyuan Li, Xiujun Li, Lawrence Carin, Jianfeng Gao
- XGPT: Cross-modal Generative Pre-training For Image Captioning Qiaolin Xia et al.
- ERNIE-GEN: An Enhanced Multi-flow Pre-training And Fine-tuning Framework For Natural Language Generation Dongling Xiao et al.
- Codebert: A Pre-trained Model For Programming And Natural Languages Zhangyin Feng et al.
- M3P: Learning Universal Representations Via Multitask Multilingual Multimodal Pre-training Minheng Ni et al.
- Scaling Laws For Neural Language Models Jared Kaplan et al.
- Deformer: Decomposing Pre-trained Transformers For Faster Question Answering Qingqing Cao, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian
- TIME: Text And Image Mutual-translation Adversarial Networks Bingchen Liu, Kunpeng Song, Yizhe Zhu, Gerard De Melo, Ahmed Elgammal
- PALM: Pre-training An Autoencoding & Autoregressive Language Model For Context-conditioned Generation Bin Bi et al.
- Contrastive Distillation On Intermediate Representations For Language Model Compression Siqi Sun et al.
- DIET: Lightweight Language Understanding For Dialogue Systems Tanja Bunk, Daksh Varshneya, Vladimir Vlasov, Alan Nichol
- A Large-scale Chinese Short-text Conversation Dataset Yida Wang et al.
- A Recurrent Vision-and-language BERT For Navigation Yicong Hong, Qi Wu, Yuankai Qi, Cristian Rodriguez-opazo, Stephen Gould
- ECONET: Effective Continual Pretraining Of Language Models For Event Temporal Reasoning Rujun Han, Xiang Ren, Nanyun Peng
- CPM: A Large-scale Generative Chinese Pre-trained Language Model Zhengyan Zhang et al.
- Multilingual Denoising Pre-training For Neural Machine Translation Yinhan Liu et al.
- Behind The Scene: Revealing The Secrets Of Pre-trained Vision-and-language Models Jize Cao et al.
- BLEURT: Learning Robust Metrics For Text Generation Thibault Sellam, Dipanjan Das, Ankur P. Parikh
- Auto-captions On GIF: A Large-scale Video-sentence Dataset For Vision-language Pre-training Yingwei Pan et al.
- Contrastive Code Representation Learning Paras Jain et al.
- An Empirical Investigation Of Pre-trained Transformer Language Models For Open-domain Dialogue Generation Piji Li
- POINTER: Constrained Progressive Text Generation Via Insertion-based Generative Pre-training Yizhe Zhang et al.
- Prophetnet: Predicting Future N-gram For Sequence-to-sequence Pre-training Weizhen Qi et al.
- Encoding Syntactic Knowledge In Transformer Encoder For Intent Detection And Slot Filling Jixuan Wang, Kai Wei, Martin Radfar, Weiwei Zhang, Clement Chung
- ETC: Encoding Long And Structured Inputs In Transformers Joshua Ainslie et al.
- Dialoglue: A Natural Language Understanding Benchmark For Task-oriented Dialogue Shikib Mehri, Mihail Eric, Dilek Hakkani-tur
- Language Models Are Few-shot Learners Tom B. Brown et al.
- Text-to-text Pre-training For Data-to-text Tasks Mihir Kale, Abhinav Rastogi
- Incorporating External Knowledge Through Pre-training For Natural Language To Code Generation Frank F. Xu, Zhengbao Jiang, Pengcheng Yin, Bogdan Vasilescu, Graham Neubig
- Earlybert: Efficient BERT Training Via Early-bird Lottery Tickets Xiaohan Chen et al.
- Multi-modal Open-domain Dialogue Kurt Shuster, Eric Michael Smith, Da Ju, Jason Weston
- Pre-training Via Paraphrasing Mike Lewis et al.
- Accelerating Training Of Transformer-based Language Models With Progressive Layer Dropping Minjia Zhang, Yuxiong He
- XLM-T: Scaling Up Multilingual Machine Translation With Pretrained Cross-lingual Transformer Encoders Shuming Ma et al.
- What Does BERT Know About Books, Movies And Music? Probing BERT For Conversational Recommendation Gustavo Penha, Claudia Hauff
- Rethinking Positional Encoding In Language Pre-training Guolin Ke, Di He, Tie-yan Liu
- KGPT: Knowledge-grounded Pre-training For Data-to-text Generation Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang
- Unilmv2: Pseudo-masked Language Models For Unified Language Model Pre-training Hangbo Bao et al.
- X-LXMERT: Paint, Caption And Answer Questions With Multi-modal Transformers Jaemin Cho, Jiasen Lu, Dustin Schwenk, Hannaneh Hajishirzi, Aniruddha Kembhavi
- TAP: Text-aware Pre-training For Text-vqa And Text-caption Zhengyuan Yang et al.
- ELECTRA: Pre-training Text Encoders As Discriminators Rather Than Generators Kevin Clark, Minh-thang Luong, Quoc V. Le, Christopher D. Manning
- Adversarial Training For Large Neural Language Models Xiaodong Liu et al.
- Charbert: Character-aware Pre-trained Language Model Wentao Ma et al.
- Rethinking Embedding Coupling In Pre-trained Language Models Hyung Won Chung, Thibault Févry, Henry Tsai, Melvin Johnson, Sebastian Ruder
- LRC-BERT: Latent-representation Contrastive Knowledge Distillation For Natural Language Understanding Hao Fu et al.
- Vokenization: Improving Language Understanding With Contextualized, Visual-grounded Supervision Hao Tan, Mohit Bansal
- Vlmo: Unified Vision-language Pre-training With Mixture-of-modality-experts Hangbo Bao et al.
- Ernie-vilg: Unified Generative Pre-training For Bidirectional Vision-language Generation Han Zhang et al.
- E2E-VLP: End-to-end Vision-language Pre-training Enhanced By Visual Learning Haiyang Xu et al.
- Codified Audio Language Modeling Learns Useful Representations For Music Information Retrieval Rodrigo Castellon, Chris Donahue, Percy Liang
- Wenlan: Bridging Vision And Language By Large-scale Multi-modal Pre-training Yuqi Huo et al.
- G-transformer For Document-level Machine Translation Guangsheng Bao, Yue Zhang, Zhiyang Teng, Boxing Chen, Weihua Luo
- Text-free Prosody-aware Generative Spoken Language Modeling Eugene Kharitonov et al.
- Advancing High-resolution Video-language Representation With Large-scale Video Transcriptions Hongwei Xue et al.
- MWP-BERT: Numeracy-augmented Pre-training For Math Word Problem Solving Zhenwen Liang et al.
- Learning Rich Representation Of Keyphrases From Text Mayank Kulkarni, Debanjan Mahata, Ravneet Arora, Rajarshi Bhowmik
- EVA: An Open-domain Chinese Dialogue System With Large-scale Generative Pre-training Hao Zhou et al.
- FILIP: Fine-grained Interactive Language-image Pre-training Lewei Yao et al.
- Unsupervised Corpus Aware Language Model Pre-training For Dense Passage Retrieval Luyu Gao, Jamie Callan
- ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Shuohuan Wang et al.
- Fewshotqa: A Simple Framework For Few-shot Learning Of Question Answering Tasks Using Pre-trained Text-to-text Models Rakesh Chada, Pradeep Natarajan
- Societal Biases In Language Generation: Progress And Challenges Emily Sheng, Kai-wei Chang, Premkumar Natarajan, Nanyun Peng
- Lora: Low-rank Adaptation Of Large Language Models Edward J. Hu et al.
- Improving Gender Fairness Of Pre-trained Language Models Without Catastrophic Forgetting Zahra Fatemi, Chen Xing, Wenhao Liu, Caiming Xiong
- Align And Prompt: Video-and-language Pre-training With Entity Prompts Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C. H. Hoi
- XLM-E: Cross-lingual Language Model Pre-training Via ELECTRA Zewen Chi et al.
- Clip4caption: CLIP For Video Caption Mingkang Tang et al.
- Urltran: Improving Phishing URL Detection Using Transformers Pranav Maneriker et al.
- Causal Attention For Vision-language Tasks Xu Yang, Hanwang Zhang, Guojun Qi, Jianfei Cai
- Dialoglm: Pre-trained Model For Long Dialogue Understanding And Summarization Ming Zhong, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng
- Compressing Visual-linguistic Model Via Knowledge Distillation Zhiyuan Fang et al.
- Hierarchical Learning For Generation With Long Source Sequences Tobias Rohde, Xiaoxia Wu, Yinhan Liu
- UC2: Universal Cross-lingual Cross-modal Vision-and-language Pre-training Mingyang Zhou et al.
- Greedy-layer Pruning: Speeding Up Transformer Models For Natural Language Processing David Peer, Sebastian Stabinger, Stefan Engl, Antonio Rodriguez-sanchez
- PPT: Pre-trained Prompt Tuning For Few-shot Learning Yuxian Gu, Xu Han, Zhiyuan Liu, Minlie Huang
- Ext5: Towards Extreme Multi-task Scaling For Transfer Learning Vamsi Aribandi et al.
- Calibrate Before Use: Improving Few-shot Performance Of Language Models Tony Z. Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh
- The Stability-efficiency Dilemma: Investigating Sequence Length Warmup For Training GPT Models Conglong Li, Minjia Zhang, Yuxiong He
- Codet5: Identifier-aware Unified Pre-trained Encoder-decoder Models For Code Understanding And Generation Yue Wang, Weishi Wang, Shafiq Joty, Steven C. H. Hoi
- Supervision Exists Everywhere: A Data Efficient Contrastive Language-image Pre-training Paradigm Yangguang Li et al.
- Structurallm: Structural Pre-training For Form Understanding Chenliang Li et al.
- Prompting Visual-language Models For Efficient Video Understanding Chen Ju, Tengda Han, Kunhao Zheng, Ya Zhang, Weidi Xie
- NÜWA: Visual Synthesis Pre-training For Neural Visual World Creation Chenfei Wu et al.
- Recent Advances In Natural Language Processing Via Large Pre-trained Language Models: A Survey Bonan Min et al.
- NSP-BERT: A Prompt-based Few-shot Learner Through An Original Pre-training Task--next Sentence Prediction Yi Sun, Yu Zheng, Chao Hao, Hangping Qiu
- SGEITL: Scene Graph Enhanced Image-text Learning For Visual Commonsense Reasoning Zhecan Wang et al.
- Scale Efficiently: Insights From Pre-training And Fine-tuning Transformers Yi Tay et al.
- Are Pre-trained Convolutions Better Than Pre-trained Transformers? Yi Tay et al.
- Debertav3: Improving Deberta Using Electra-style Pre-training With Gradient-disentangled Embedding Sharing Pengcheng He, Jianfeng Gao, Weizhu Chen
- HTLM: Hyper-text Pre-training And Prompting Of Language Models Armen Aghajanyan et al.
- Muppet: Massive Multi-task Representations With Pre-finetuning Armen Aghajanyan et al.
- Vl-adapter: Parameter-efficient Transfer Learning For Vision-and-language Tasks Yi-lin Sung, Jaemin Cho, Mohit Bansal
- ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Yu Sun et al.
- An Empirical Study Of Training End-to-end Vision-and-language Transformers Zi-yi Dou et al.
- Denseclip: Language-guided Dense Prediction With Context-aware Prompting Yongming Rao et al.
- A General Language Assistant As A Laboratory For Alignment Amanda Askell et al.
- CPM-2: Large-scale Cost-effective Pre-trained Language Models Zhengyan Zhang et al.
- Tacl: Improving BERT Pre-training With Token-aware Contrastive Learning Yixuan Su et al.
- Multi-task Pre-training For Plug-and-play Task-oriented Dialogue System Yixuan Su et al.
- Clip-adapter: Better Vision-language Models With Feature Adapters Peng Gao et al.
- MT6: Multilingual Pretrained Text-to-text Transformer With Translation Pairs Zewen Chi et al.
- Indicbart: A Pre-trained Model For Indic Natural Language Generation Raj Dabre et al.
- Distilling Large Language Models Into Tiny And Effective Students Using Pqrnn Prabhu Kaliamoorthi, Aditya Siddhant, Edward Li, Melvin Johnson
- CANINE: Pre-training An Efficient Tokenization-free Encoder For Language Representation Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting
- Unlocking Compositional Generalization In Pre-trained Models Using Intermediate Representations Jonathan Herzig et al.
- Open Domain Question Answering Over Tables Via Dense Retrieval Jonathan Herzig, Thomas Müller, Syrine Krichene, Julian Martin Eisenschlos
- OPT: Omni-perception Pre-trainer For Cross-modal Understanding And Generation Jing Liu et al.
- Multi-modal Understanding And Generation For Medical Images And Text Via Vision-language Pre-training Jong Hak Moon, Hyungyung Lee, Woncheol Shin, Young-hak Kim, Edward Choi
- Tip-adapter: Training-free Clip-adapter For Better Vision-language Modeling Renrui Zhang et al.
- Deltalm: Encoder-decoder Pre-training For Language Generation And Translation By Augmenting Pretrained Multilingual Encoders Shuming Ma et al.
- UFO: A Unified Transformer For Vision-language Representation Learning Jianfeng Wang et al.
- Condenser: A Pre-training Architecture For Dense Retrieval Luyu Gao, Jamie Callan
- Lightningdot: Pre-training Visual-semantic Embeddings For Real-time Image-text Retrieval Siqi Sun et al.
- Unified Pre-training For Program Understanding And Generation Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-wei Chang
- GALAXY: A Generative Pre-trained Model For Task-oriented Dialog With Semi-supervised Learning And Explicit Policy Injection Wanwei He et al.
- Longt5: Efficient Text-to-text Transformer For Long Sequences Mandy Guo et al.
- Learning To Prompt For Vision-language Models Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu
- Augmenting Sequential Recommendation With Pseudo-prior Items Via Reversely Pre-training Transformer Zhiwei Liu, Ziwei Fan, Yu Wang, Philip S. Yu
- Robertuito: A Pre-trained Language Model For Social Media Text In Spanish Juan Manuel Pérez, Damián A. Furman, Laura Alonso Alemany, Franco Luque
- Retromae: Pre-training Retrieval-oriented Language Models Via Masked Auto-encoder Shitao Xiao, Zheng Liu, Yingxia Shao, Zhao Cao
- Revisiting Neural Scaling Laws In Language And Vision Ibrahim Alabdulmohsin, Behnam Neyshabur, Xiaohua Zhai
- Enabling Multimodal Generation On CLIP Via Vision-language Knowledge Distillation Wenliang Dai et al.
- EVA2.0: Investigating Open-domain Chinese Dialogue Systems With Large-scale Pre-training Yuxian Gu et al.
- An Efficient Memory-augmented Transformer For Knowledge-intensive NLP Tasks Yuxiang Wu et al.
- Layoutlmv3: Pre-training For Document AI With Unified Text And Image Masking Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei
- BLIP: Bootstrapping Language-image Pre-training For Unified Vision-language Understanding And Generation Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi
- Cramming: Training A Language Model On A Single GPU In One Day Jonas Geiping, Tom Goldstein
- Evolution Through Large Models Joel Lehman et al.
- Vision-language Pre-training With Triple Contrastive Learning Jinyu Yang et al.
- Lilt: A Simple Yet Effective Language-independent Layout Transformer For Structured Document Understanding Jiapeng Wang, Lianwen Jin, Kai Ding
- GIT: A Generative Image-to-text Transformer For Vision And Language Jianfeng Wang et al.
- Adapting Pre-trained Language Models To African Languages Via Multilingual Adaptive Fine-tuning Jesujoba O. Alabi, David Ifeoluwa Adelani, Marius Mosbach, Dietrich Klakow
- SPACE-3: Unified Dialog Model Pre-training For Task-oriented Dialog Understanding And Generation Wanwei He et al.
- BERTIN: Efficient Pre-training Of A Spanish Language Model Using Perplexity Sampling Javier De La Rosa et al.
- Instructionner: A Multi-task Instruction-based Generative Framework For Few-shot NER Liwen Wang et al.
- Exploring The Universal Vulnerability Of Prompt-based Learning Paradigm Lei Xu, Yangyi Chen, Ganqu Cui, Hongcheng Gao, Zhiyuan Liu
- Reproducible Scaling Laws For Contrastive Language-image Learning Mehdi Cherti et al.
- Mixgen: A New Multi-modal Data Augmentation Xiaoshuai Hao et al.
- Pangu-coder: Program Synthesis With Function-level Language Modeling Fenia Christopoulou et al.
- Vision-language Intelligence: Tasks, Representation Learning, And Large Models Feng Li et al.
- NLX-GPT: A Model For Natural Language Explanations In Vision And Vision-language Tasks Fawaz Sammani, Tanmoy Mukherjee, Nikos Deligiannis
- Sgva-clip: Semantic-guided Visual Adapting Of Vision-language Models For Few-shot Image Classification Fang Peng, Xiaoshan Yang, Linhui Xiao, Yaowei Wang, Changsheng Xu
- The Optimal BERT Surgeon: Scalable And Accurate Second-order Pruning For Large Language Models Eldar Kurtic et al.
- LAVIS: A Library For Language-vision Intelligence Dongxu Li et al.
- Learning Vector-quantized Item Representation For Transferable Sequential Recommenders Yupeng Hou, Zhankui He, Julian Mcauley, Wayne Xin Zhao
- Factpegasus: Factuality-aware Pre-training And Fine-tuning For Abstractive Summarization David Wan, Mohit Bansal
- Large Language Models Meet Nl2code: A Survey Daoguang Zan et al.
- CERT: Continual Pre-training On Sketches For Library-oriented Code Generation Daoguang Zan et al.
- Scaling Laws And Interpretability Of Learning From Repeated Data Danny Hernandez et al.
- Democratizing Contrastive Language-image Pre-training: A CLIP Benchmark Of Data, Model, And Supervision Yufeng Cui, Lichen Zhao, Feng Liang, Yangguang Li, Jing Shao
- Learning Video Representations From Large Language Models Yue Zhao, Ishan Misra, Philipp Krähenbühl, Rohit Girdhar
- A Unified End-to-end Retriever-reader Framework For Knowledge-based VQA Yangyang Guo et al.
- Scaling Language-image Pre-training Via Masking Yanghao Li, Haoqi Fan, Ronghang Hu, Christoph Feichtenhofer, Kaiming He
- No More Fine-tuning? An Experimental Evaluation Of Prompt Tuning In Code Intelligence Chaozheng Wang et al.
- Long-form Video-language Pre-training With Multimodal Temporal Contrastive Learning Yuchong Sun et al.
- Exploring The Limits Of Domain-adaptive Training For Detoxifying Large-scale Language Models Boxin Wang et al.
- LERT: A Linguistically-motivated Pre-trained Language Model Yiming Cui, Wanxiang Che, Shijin Wang, Ting Liu
- GODEL: Large-scale Pre-training For Goal-directed Dialog Baolin Peng et al.
- What Do They Capture? -- A Structural Analysis Of Pre-trained Language Models For Source Code Yao Wan et al.
- UL2: Unifying Language Learning Paradigms Yi Tay et al.
- Mslam: Massively Multilingual Joint Pre-training For Speech And Text Ankur Bapna et al.
- Language Model Compression With Weighted Low-rank Factorization Yen-chang Hsu et al.
- Should You Mask 15% In Masked Language Modeling? Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen
- Position-guided Text Prompt For Vision-language Pre-training Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan
- REVEAL: Retrieval-augmented Visual-language Pre-training With Multi-source Multimodal Knowledge Memory Ziniu Hu et al.
- Personalized Prompt For Sequential Recommendation Yiqing Wu et al.
- PEVL: Position-enhanced Pre-training And Prompt Tuning For Vision-language Models Yuan Yao et al.
- Large Language Models Struggle To Learn Long-tail Knowledge Nikhil Kandpal, Haikang Deng, Adam Roberts, Eric Wallace, Colin Raffel
- Clinical Prompt Learning With Frozen Language Models Niall Taylor, Yi Zhang, Dan Joyce, Alejo Nevado-holgado, Andrey Kormilitzin
- Fine-tuned Language Models Are Continual Learners Thomas Scialom, Tuhin Chakrabarty, Smaranda Muresan
- KALA: Knowledge-augmented Language Model Adaptation Minki Kang, Jinheon Baek, Sung Ju Hwang
- An Empirical Study Of End-to-end Video-language Transformers With Masked Visual Modeling Tsu-jui Fu et al.
- Large Language Models Are Zero-shot Reasoners Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa
- From Image To Language: A Critical Analysis Of Visual Question Answering (VQA) Approaches, Challenges, And Opportunities Md Farhan Ishmam, Md Sakib Hossain Shovon, M. F. Mridha, Nilanjan Dey
- Scaling Autoregressive Multi-modal Models: Pretraining And Instruction Tuning Lili Yu et al.
- Improving CLIP Training With Language Rewrites Lijie Fan, Dilip Krishnan, Phillip Isola, Dina Katabi, Yonglong Tian
- Improving Text Embeddings With Large Language Models Liang Wang et al.
- ULIP-2: Towards Scalable Multimodal Pre-training For 3D Understanding Le Xue et al.
- Tallrec: An Effective And Efficient Tuning Framework To Align Large Language Model With Recommendation Keqin Bao et al.
- Just Ask For Calibration: Strategies For Eliciting Calibrated Confidence Scores From Language Models Fine-tuned With Human Feedback Katherine Tian et al.
- Biomedical Knowledge Graph-optimized Prompt Generation For Large Language Models Karthik Soman et al.
- ALIP: Adaptive Language-image Pre-training With Synthetic Caption Kaicheng Yang et al.
- BLIP-2: Bootstrapping Language-image Pre-training With Frozen Image Encoders And Large Language Models Junnan Li, Dongxu Li, Silvio Savarese, Steven Hoi
- Llama-reviewer: Advancing Code Review Automation With Large Language Models Through Parameter-efficient Fine-tuning Junyi Lu, Lei Yu, Xiaojia Li, Li Yang, Chun Zuo
- GQA: Training Generalized Multi-query Transformer Models From Multi-head Checkpoints Joshua Ainslie et al.
- Missrec: Pre-training And Transferring Multi-modal Interest-aware Sequence Representation For Recommendation Jinpeng Wang et al.
- Prompt-and-align: Prompt-based Social Alignment For Few-shot Fake News Detection Jiaying Wu, Shen Li, Ailin Deng, Miao Xiong, Bryan Hooi
- Empowering Molecule Discovery For Molecule-caption Translation With Large Language Models: A Chatgpt Perspective Jiatong Li et al.
- Unified-io 2: Scaling Autoregressive Multimodal Models With Vision, Language, Audio, And Action Jiasen Lu et al.
- Unlearn What You Want To Forget: Efficient Unlearning For Llms Jiaao Chen, Diyi Yang
- VILA: On Pre-training For Visual Language Models Ji Lin et al.
- Auditing Large Language Models: A Three-layered Approach Jakob Mökander, Jonas Schuett, Hannah Rose Kirk, Luciano Floridi
- Evaluation Of Chatgpt On Biomedical Tasks: A Zero-shot Comparison With Fine-tuned Generative Transformers Israt Jahan, Md Tahmid Rahman Laskar, Chun Peng, Jimmy Huang
- Self-chained Image-language Model For Video Localization And Question Answering Shoubin Yu, Jaemin Cho, Prateek Yadav, Mohit Bansal
- Unifying Large Language Models And Knowledge Graphs: A Roadmap Shirui Pan et al.
- Fine-tuning Language Models With Just Forward Passes Sadhika Malladi et al.
- Prompt, Generate, Then Cache: Cascade Of Foundation Models Makes Strong Few-shot Learners Renrui Zhang et al.
- Scaling Laws For Language Encoding Models In Fmri Richard Antonello, Aditya Vaidya, Alexander G. Huth
- Scalable Educational Question Generation With Pre-trained Language Models Sahan Bulathwela, Hamze Muse, Emine Yilmaz
- Masked Vision And Language Pre-training With Unimodal And Multimodal Contrastive Losses For Medical Visual Question Answering Pengfei Li, Gang Liu, Jinlong He, Zixu Zhao, Shenjun Zhong
- Fusecap: Leveraging Large Language Models For Enriched Fused Image Captions Noam Rotstein, David Bensaid, Shaked Brody, Roy Ganz, Ron Kimmel
- Med-flamingo: A Multimodal Medical Few-shot Learner Michael Moor et al.
- Having Beer After Prayer? Measuring Cultural Bias In Large Language Models Tarek Naous, Michael J. Ryan, Alan Ritter, Wei Xu
- Emergent And Predictable Memorization In Large Language Models Stella Biderman et al.
- Zhongjing: Enhancing The Chinese Medical Capabilities Of Large Language Model Through Expert Feedback And Real-world Multi-turn Dialogue Songhua Yang et al.
- Promptcblue: A Chinese Prompt Tuning Benchmark For The Medical Domain Wei Zhu, Xiaoling Wang, Huanran Zheng, Mosha Chen, Buzhou Tang
- Alpha-clip: A CLIP Model Focusing On Wherever You Want Zeyi Sun et al.
- A Survey Of Large Language Models Wayne Xin Zhao et al.
- EVA-02: A Visual Representation For Neon Genesis Yuxin Fang et al.
- Is Chatgpt Good At Search? Investigating Large Language Models As Re-ranking Agents Weiwei Sun et al.
- Dr Chatgpt, Tell Me What I Want To Hear: How Prompt Knowledge Impacts Health Answer Correctness Guido Zuccon, Bevan Koopman
- Efficient Streaming Language Models With Attention Sinks Guangxuan Xiao, Yuandong Tian, Beidi Chen, Song Han, Mike Lewis
- Cheap And Quick: Efficient Vision-language Instruction Tuning For Large Language Models Gen Luo et al.
- Codegen2: Lessons For Training Llms On Programming And Natural Languages Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou
- Speechgpt: Empowering Large Language Models With Intrinsic Cross-modal Conversational Abilities Dong Zhang et al.
- The Vector Grounding Problem Dimitri Coelho Mollo, Raphaël Millière
- 3d-vista: Pre-trained Transformer For 3D Vision And Text Alignment Ziyu Zhu et al.
- An Iterative Optimizing Framework For Radiology Report Summarization With Chatgpt Chong Ma et al.
- Motiongpt: Human Motion As A Foreign Language Biao Jiang et al.
- Baichuan 2: Open Large-scale Language Models Aiyuan Yang et al.
- Embodiedgpt: Vision-language Pre-training Via Embodied Chain Of Thought Yao Mu et al.
- Recommender Systems In The Era Of Large Language Models (llms) Zihuai Zhao et al.
- Efficient And Effective Text Encoding For Chinese Llama And Alpaca Yiming Cui, Ziqing Yang, Xin Yao
- Summary Of Chatgpt-related Research And Perspective Towards The Future Of Large Language Models Yiheng Liu et al.
- Making Large Language Models Perform Better In Knowledge Graph Completion Yichi Zhang et al.
- Xuanyuan 2.0: A Large Chinese Financial Chat Model With Hundreds Of Billions Parameters Xuanyu Zhang, Qing Yang, Dongliang Xu
- Instructblip: Towards General-purpose Vision-language Models With Instruction Tuning Wenliang Dai et al.
- SPHINX: The Joint Mixing Of Weights, Tasks, And Visual Embeddings For Multi-modal Large Language Models Ziyi Lin et al.
- Educhat: A Large-scale Language Model-based Chatbot System For Intelligent Education Yuhao Dan et al.
- Toolqa: A Dataset For LLM Question Answering With External Tools Yuchen Zhuang, Yue Yu, Kuan Wang, Haotian Sun, Chao Zhang
- Preventing Zero-shot Transfer Degradation In Continual Learning Of Vision-language Models Zangwei Zheng et al.
- Eyes Wide Shut? Exploring The Visual Shortcomings Of Multimodal Llms Shengbang Tong et al.
- Me Llama: Foundation Large Language Models For Medical Applications Qianqian Xie et al.
- Pixart-Σ: Weak-to-strong Training Of Diffusion Transformer For 4K Text-to-image Generation Junsong Chen et al.
- MM1: Methods, Analysis & Insights From Multimodal LLM Pre-training Brandon Mckinzie et al.
- AI And Memory Wall Amir Gholami et al.
- Does Fine-tuning Llms On New Knowledge Encourage Hallucinations? Zorik Gekhman et al.
- Llmparser: An Exploratory Study On Using Large Language Models For Log Parsing Zeyang Ma, An Ran Chen, Dong Jae Kim, Tse-hsun Chen, Shaowei Wang
- Understanding Llms: A Comprehensive Overview From Training To Inference Yiheng Liu et al.
- Unist: A Prompt-empowered Universal Model For Urban Spatio-temporal Prediction Yuan Yuan, Jingtao Ding, Jie Feng, Depeng Jin, Yong Li
- Datasets For Large Language Models: A Comprehensive Survey Yang Liu, Jiahuan Cao, Chongyu Liu, Kai Ding, Lianwen Jin
🏷 Prompting
- How Can We Know What Language Models Know? Zhengbao Jiang, Frank F. Xu, Jun Araki, Graham Neubig
- Say What I Want: Towards The Dark Side Of Neural Dialogue Models Haochen Liu, Tyler Derr, Zitao Liu, Jiliang Tang
- Content Planning For Neural Story Generation With Aristotelian Rescoring Seraphina Goldfarb-tarrant, Tuhin Chakrabarty, Ralph Weischedel, Nanyun Peng
- The Radicalization Risks Of GPT-3 And Advanced Neural Language Models Kris Mcguffie, Alex Newhouse
- Making Pre-trained Language Models Better Few-shot Learners Tianyu Gao, Adam Fisch, Danqi Chen
- Autoprompt: Eliciting Knowledge From Language Models With Automatically Generated Prompts Taylor Shin, Yasaman Razeghi, Robert L. Iv Logan, Eric Wallace, Sameer Singh
- Realtoxicityprompts: Evaluating Neural Toxic Degeneration In Language Models Samuel Gehman, Suchin Gururangan, Maarten Sap, Yejin Choi, Noah A. Smith
- Grounded Language Learning Fast And Slow Felix Hill et al.
- Collaborative Storytelling With Large-scale Neural Language Models Eric Nichols, Leo Gao, Randy Gomez
- What Does BERT Know About Books, Movies And Music? Probing BERT For Conversational Recommendation Gustavo Penha, Claudia Hauff
- Few-shot Learning With Multilingual Language Models Xi Victoria Lin et al.
- Lightner: A Lightweight Tuning Paradigm For Low-resource NER Via Pluggable Prompting Xiang Chen et al.
- On Transferability Of Prompt Tuning For Natural Language Processing Yusheng Su et al.
- Prefix-tuning: Optimizing Continuous Prompts For Generation Xiang Lisa Li, Percy Liang
- P-tuning V2: Prompt Tuning Can Be Comparable To Fine-tuning Universally Across Scales And Tasks Xiao Liu et al.
- GPT Understands, Too Xiao Liu et al.
- Learning How To Ask: Querying Lms With Mixtures Of Soft Prompts Guanghui Qin, Jason Eisner
- Cutting Down On Prompts And Parameters: Simple Few-shot Learning With Language Models Robert L. Iv Logan et al.
- Reframing Instructional Prompts To Gptk's Language Swaroop Mishra, Daniel Khashabi, Chitta Baral, Yejin Choi, Hannaneh Hajishirzi
- True Few-shot Learning With Language Models Ethan Perez, Douwe Kiela, Kyunghyun Cho
- Text-free Prosody-aware Generative Spoken Language Modeling Eugene Kharitonov et al.
- BERT, Mbert, Or Bibert? A Study On Contextualized Embeddings For Neural Machine Translation Haoran Xu, Benjamin Van Durme, Kenton Murray
- Prompt Programming For Large Language Models: Beyond The Few-shot Paradigm Laria Reynolds, Kyle Mcdonell
- An Explanation Of In-context Learning As Implicit Bayesian Inference Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma
- Multitask Prompted Training Enables Zero-shot Task Generalization Victor Sanh et al.
- A Recipe For Arbitrary Text Style Transfer With Large Language Models Emily Reif et al.
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- PTR: Prompt Tuning With Rules For Text Classification Xu Han, Weilin Zhao, Ning Ding, Zhiyuan Liu, Maosong Sun
- Improving Gender Fairness Of Pre-trained Language Models Without Catastrophic Forgetting Zahra Fatemi, Chen Xing, Wenhao Liu, Caiming Xiong
- Align And Prompt: Video-and-language Pre-training With Entity Prompts Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C. H. Hoi
- Controllable Generation From Pre-trained Language Models Via Inverse Prompting Xu Zou et al.
- Generate, Annotate, And Learn: NLP With Synthetic Text Xuanli He, Islam Nassar, Jamie Kiros, Gholamreza Haffari, Mohammad Norouzi
- PPT: Pre-trained Prompt Tuning For Few-shot Learning Yuxian Gu, Xu Han, Zhiyuan Liu, Minlie Huang
- Calibrate Before Use: Improving Few-shot Performance Of Language Models Tony Z. Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh
- A Plug-and-play Method For Controlled Text Generation Damian Pascual, Beni Egressy, Clara Meister, Ryan Cotterell, Roger Wattenhofer
- Zero-shot Recommendation As Language Modeling Damien Sileo, Wout Vossen, Robbe Raymaekers
- True Few-shot Learning With Prompts -- A Real-world Perspective Timo Schick, Hinrich Schütze
- Why Do Pretrained Language Models Help In Downstream Tasks? An Analysis Of Head And Prompt Tuning Colin Wei, Sang Michael Xie, Tengyu Ma
- Meta-learning Via Language Model In-context Tuning Yanda Chen, Ruiqi Zhong, Sheng Zha, George Karypis, He He
- Exploring Prompt-based Few-shot Learning For Grounded Dialog Generation Chujie Zheng, Minlie Huang
- Dialogue State Tracking With A Language Model Using Schema-driven Prompting Chia-hsuan Lee, Hao Cheng, Mari Ostendorf
- LFPT5: A Unified Framework For Lifelong Few-shot Language Learning Based On Prompt Tuning Of T5 Chengwei Qin, Shafiq Joty
- Prompting Visual-language Models For Efficient Video Understanding Chen Ju, Tengda Han, Kunhao Zheng, Ya Zhang, Weidi Xie
- Pre-train, Prompt, And Predict: A Systematic Survey Of Prompting Methods In Natural Language Processing Pengfei Liu et al.
- Fantastically Ordered Prompts And Where To Find Them: Overcoming Few-shot Prompt Order Sensitivity Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, Pontus Stenetorp
- The Power Of Scale For Parameter-efficient Prompt Tuning Brian Lester, Rami Al-rfou, Noah Constant
- Symbolic Knowledge Distillation: From General Language Models To Commonsense Models Peter West et al.
- Recent Advances In Natural Language Processing Via Large Pre-trained Language Models: A Survey Bonan Min et al.
- What Changes Can Large-scale Language Models Bring? Intensive Study On Hyperclova: Billions-scale Korean Generative Pretrained Transformers Boseop Kim et al.
- Openprompt: An Open-source Framework For Prompt-learning Ning Ding et al.
- Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners Ningyu Zhang et al.
- Prompt-learning For Fine-grained Entity Typing Ning Ding et al.
- NSP-BERT: A Prompt-based Few-shot Learner Through An Original Pre-training Task--next Sentence Prediction Yi Sun, Yu Zheng, Chao Hao, Hangping Qiu
- HTLM: Hyper-text Pre-training And Prompting Of Language Models Armen Aghajanyan et al.
- AI Chains: Transparent And Controllable Human-ai Interaction By Chaining Large Language Model Prompts Tongshuang Wu, Michael Terry, Carrie J. Cai
- Vl-adapter: Parameter-efficient Transfer Learning For Vision-and-language Tasks Yi-lin Sung, Jaemin Cho, Mohit Bansal
- Characterchat: Supporting The Creation Of Fictional Characters Through Conversation And Progressive Manifestation With A Chatbot Oliver Schmitt, Daniel Buschek
- Learning To Retrieve Prompts For In-context Learning Ohad Rubin, Jonathan Herzig, Jonathan Berant
- Few-shot Bot: Prompt-based Learning For Dialogue Systems Andrea Madotto, Zhaojiang Lin, Genta Indra Winata, Pascale Fung
- Denseclip: Language-guided Dense Prediction With Context-aware Prompting Yongming Rao et al.
- Revisiting Self-training For Few-shot Learning Of Language Model Yiming Chen et al.
- A General Language Assistant As A Laboratory For Alignment Amanda Askell et al.
- An Empirical Study Of GPT-3 For Few-shot Knowledge-based VQA Zhengyuan Yang et al.
- CPM-2: Large-scale Cost-effective Pre-trained Language Models Zhengyan Zhang et al.
- Do Prompt-based Models Really Understand The Meaning Of Their Prompts? Albert Webson, Ellie Pavlick
- Large Pre-trained Language Models Contain Human-like Biases Of What Is Right And Wrong To Do Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin A. Rothkopf, Kristian Kersting
- Clip-adapter: Better Vision-language Models With Feature Adapters Peng Gao et al.
- Spot: Better Frozen Model Adaptation Through Soft Prompt Transfer Tu Vu, Brian Lester, Noah Constant, Rami Al-rfou, Daniel Cer
- How Many Data Points Is A Prompt Worth? Teven Le Scao, Alexander M. Rush
- FLEX: Unifying Evaluation For Few-shot NLP Jonathan Bragg, Arman Cohan, Kyle Lo, Iz Beltagy
- Challenges In Detoxifying Language Models Johannes Welbl et al.
- Planning With Learned Entity Prompts For Abstractive Summarization Shashi Narayan et al.
- Multimodal Few-shot Learning With Frozen Language Models Maria Tsimpoukelli et al.
- What Makes Good In-context Examples For GPT-3? Jiachang Liu et al.
- Evaluating Large Language Models Trained On Code Mark Chen et al.
- Program Synthesis With Large Language Models Jacob Austin et al.
- Generated Knowledge Prompting For Commonsense Reasoning Jiacheng Liu et al.
- Gpt3mix: Leveraging Large-scale Language Models For Text Augmentation Kang Min Yoo, Dongju Park, Jaewook Kang, Sang-woo Lee, Woomyeong Park
- WARP: Word-level Adversarial Reprogramming Karen Hambardzumyan, Hrant Khachatrian, Jonathan May
- Learning To Prompt For Vision-language Models Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu
- Adapting Language Models For Zero-shot Learning By Meta-tuning On Dataset And Prompt Collections Ruiqi Zhong, Kristy Lee, Zheng Zhang, Dan Klein
- Reframing Human-ai Collaboration For Generating Free-text Explanations Sarah Wiegreffe, Jack Hessel, Swabha Swayamdipta, Mark Riedl, Yejin Choi
- Diverse Demonstrations Improve In-context Compositional Generalization Itay Levy, Ben Bogin, Jonathan Berant
- Progprompt: Generating Situated Robot Task Plans Using Large Language Models Ishika Singh et al.
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- Evaluating Mixed-initiative Conversational Search Systems Via User Simulation Ivan Sekulić, Mohammad Aliannejadi, Fabio Crestani
- Explanations From Large Language Models Make Small Reasoners Better Shiyang Li et al.
- Scaling Instruction-finetuned Language Models Hyung Won Chung et al.
- Recommendation As Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5) Shijie Geng, Shuchang Liu, Zuohui Fu, Yingqiang Ge, Yongfeng Zhang
- Exploring Visual Prompts For Adapting Large-scale Models Hyojin Bahng, Ali Jahanian, Swami Sankaranarayanan, Phillip Isola
- Biobart: Pretraining And Evaluation Of A Biomedical Generative Language Model Hongyi Yuan et al.
- Selective Annotation Makes Language Models Better Few-shot Learners Hongjin Su et al.
- Demystifying Prompts In Language Models Via Perplexity Estimation Hila Gonen, Srini Iyer, Terra Blevins, Noah A. Smith, Luke Zettlemoyer
- Interactive And Visual Prompt Engineering For Ad-hoc Task Adaptation With Large Language Models Hendrik Strobelt et al.
- Interleaving Retrieval With Chain-of-thought Reasoning For Knowledge-intensive Multi-step Questions Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal
- Repair Is Nearly Generation: Multilingual Program Repair With Llms Harshit Joshi et al.
- What Do Llms Know About Financial Markets? A Case Study On Reddit Market Sentiment Analysis Xiang Deng, Vasilisa Bashlovkina, Feng Han, Simon Baumgartner, Michael Bendersky
- Language Models As Zero-shot Planners: Extracting Actionable Knowledge For Embodied Agents Wenlong Huang, Pieter Abbeel, Deepak Pathak, Igor Mordatch
- Few-shot Parameter-efficient Fine-tuning Is Better And Cheaper Than In-context Learning Haokun Liu et al.
- Prompt Tuning For Generative Multimodal Pretrained Models Hao Yang et al.
- Program Of Thoughts Prompting: Disentangling Computation From Reasoning For Numerical Reasoning Tasks Wenhu Chen, Xueguang Ma, Xinyi Wang, William W. Cohen
- Teaching Algorithmic Reasoning Via In-context Learning Hattie Zhou et al.
- In-context Learning For Few-shot Dialogue State Tracking Yushi Hu et al.
- Large Language Models Are Few(1)-shot Table Reasoners Wenhu Chen
- Reasoning With Language Model Prompting: A Survey Shuofei Qiao et al.
- Rethinking With Retrieval: Faithful Large Language Model Inference Hangfeng He, Hongming Zhang, Dan Roth
- Decoupling Knowledge From Memorization: Retrieval-augmented Prompt Learning Xiang Chen et al.
- Diffusiondb: A Large-scale Prompt Gallery Dataset For Text-to-image Generative Models Zijie J. Wang et al.
- How To Prompt? Opportunities And Challenges Of Zero- And Few-shot Learning For Human-ai Interaction In Creative Applications Of Generative Models Hai Dang, Lukas Mecke, Florian Lehmann, Sven Goller, Daniel Buschek
- Multi-stage Prompting For Knowledgeable Dialogue Generation Zihan Liu et al.
- The Unreliability Of Explanations In Few-shot Prompting For Textual Reasoning Xi Ye, Greg Durrett
- Evaluating And Inducing Personality In Pre-trained Language Models Guangyuan Jiang et al.
- Prototypical Verbalizer For Prompt-based Few-shot Tuning Ganqu Cui, Shengding Hu, Ning Ding, Longtao Huang, Zhiyuan Liu
- Data Augmentation For Intent Classification With Off-the-shelf Large Language Models Gaurav Sahu et al.
- Promptcap: Prompt-guided Task-aware Image Captioning Yushi Hu et al.
- Moral Mimicry: Large Language Models Produce Moral Rationalizations Tailored To Political Identity Gabriel Simmons
- Language Models Are Multilingual Chain-of-thought Reasoners Freda Shi et al.
- Ignore Previous Prompt: Attack Techniques For Language Models Fábio Perez, Ian Ribeiro
- Re3: Generating Longer Stories With Recursive Reprompting And Revision Kevin Yang, Yuandong Tian, Nanyun Peng, Dan Klein
- Chatgpt Makes Medicine Easy To Swallow: An Exploratory Case Study On Simplified Radiology Reports Katharina Jeblick et al.
- Large Language Models Encode Clinical Knowledge Karan Singhal et al.
- Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey Sachin Kumar, Vidhisha Balachandran, Lucille Njoo, Antonios Anastasopoulos, Yulia Tsvetkov
- Healthprompt: A Zero-shot Learning Paradigm For Clinical Natural Language Processing Sonish Sivarajkumar, Yanshan Wang
- Speechprompt: An Exploration Of Prompt Tuning On Generative Spoken Language Model For Speech Processing Tasks Kai-wei Chang, Wei-cheng Tseng, Shang-wen Li, Hung-yi Lee
- Leveraging Large Language Models For Multiple Choice Question Answering Joshua Robinson, Christopher Michael Rytting, David Wingate
- When To Make Exceptions: Exploring Language Models As Accounts Of Human Moral Judgment Zhijing Jin et al.
- OPT-IML: Scaling Language Model Instruction Meta Learning Through The Lens Of Generalization Srinivasan Iyer et al.
- Can Large Language Models Truly Understand Prompts? A Case Study With Negated Prompts Joel Jang, Seonghyeon Ye, Minjoon Seo
- News Summarization And Evaluation In The Era Of GPT-3 Tanya Goyal, Junyi Jessy Li, Greg Durrett
- Generating Sequences By Learning To Self-correct Sean Welleck et al.
- Instruction Tuning For Few-shot Aspect-based Sentiment Analysis Siddharth Varia et al.
- Are Large Pre-trained Language Models Leaking Your Personal Information? Jie Huang, Hanyin Shao, Kevin Chen-chuan Chang
- Knowledge Prompting In Pre-trained Language Model For Natural Language Understanding Jianing Wang et al.
- Ask Me Anything: A Simple Strategy For Prompting Language Models Simran Arora et al.
- Scaling Autoregressive Models For Content-rich Text-to-image Generation Jiahui Yu et al.
- Large Language Models Can Self-improve Jiaxin Huang et al.
- React: Synergizing Reasoning And Acting In Language Models Shunyu Yao et al.
- Flamingo: A Visual Language Model For Few-shot Learning Jean-baptiste Alayrac et al.
- Chain-of-thought Prompting Elicits Reasoning In Large Language Models Jason Wei et al.
- A Fine-grained Comparison Of Pragmatic Language Understanding In Humans And Language Models Jennifer Hu, Sammy Floyd, Olessia Jouravlev, Evelina Fedorenko, Edward Gibson
- Convfinqa: Exploring The Chain Of Numerical Reasoning In Conversational Finance Question Answering Zhiyu Chen et al.
- Maieutic Prompting: Logically Consistent Reasoning With Recursive Explanations Jaehun Jung et al.
- Action-gpt: Leveraging Large-scale Language Models For Improved And Generalized Action Generation Sai Shashank Kalakonda, Shubh Maheshwari, Ravi Kiran Sarvadevabhatla
- PAL: Program-aided Language Models Luyu Gao et al.
- What Language Model Architecture And Pretraining Objective Work Best For Zero-shot Generalization? Thomas Wang et al.
- Prompting Is Programming: A Query Language For Large Language Models Luca Beurer-kellner, Marc Fischer, Martin Vechev
- Training Language Models To Follow Instructions With Human Feedback Long Ouyang et al.
- Instructionner: A Multi-task Instruction-based Generative Framework For Few-shot NER Liwen Wang et al.
- Teaching Small Language Models To Reason Lucie Charlotte Magister, Jonathan Mallinson, Jakub Adamek, Eric Malmi, Aliaksei Severyn
- In-context Examples Selection For Machine Translation Sweta Agrawal, Chunting Zhou, Mike Lewis, Luke Zettlemoyer, Marjan Ghazvininejad
- Structured Like A Language Model: Analysing AI As An Automated Subject Liam Magee, Vanicka Arora, Luke Munn
- Real Or Fake Text?: Investigating Human Ability To Detect Boundaries Between Human-written And Machine-generated Text Liam Dugan, Daphne Ippolito, Arun Kirubarajan, Sherry Shi, Chris Callison-burch
- Efficient Few-shot Learning Without Prompts Lewis Tunstall et al.
- Phenaki: Variable Length Video Generation From Open Domain Textual Description Ruben Villegas et al.
- Personalized Prompt Learning For Explainable Recommendation Lei Li, Yongfeng Zhang, Li Chen
- Exploring The Universal Vulnerability Of Prompt-based Learning Paradigm Lei Xu, Yangyi Chen, Ganqu Cui, Hongcheng Gao, Zhiyuan Liu
- Language Models That Seek For Knowledge: Modular Search & Generation For Dialogue And Prompt Completion Kurt Shuster et al.
- Promptagator: Few-shot Dense Retrieval From 8 Examples Zhuyun Dai et al.
- Can Large Language Models Reason About Medical Questions? Valentin Liévin, Christoffer Egeberg Hother, Andreas Geert Motzfeldt, Ole Winther
- Visual Prompt Tuning Menglin Jia et al.
- Towards Using Few-shot Prompt Learning For Automating Model Completion Meriem Ben Chaaben, Lola Burgueño, Houari Sahraoui
- Language Models With Image Descriptors Are Strong Few-shot Video-language Learners Zhenhailong Wang et al.
- Gpt-3-driven Pedagogical Agents For Training Children's Curious Question-asking Skills Rania Abdelghani et al.
- Can Machines Help Us Answering Question 16 In Datasheets, And In Turn Reflecting On Inappropriate Content? Patrick Schramowski, Christopher Tauchmann, Kristian Kersting
- Deplot: One-shot Visual Language Reasoning By Plot-to-table Translation Fangyu Liu et al.
- Legal Prompting: Teaching A Language Model To Think Like A Lawyer Fangyi Yu, Lee Quartey, Frank Schilder
- Red Teaming Language Models With Language Models Ethan Perez et al.
- Towards Unified Conversational Recommender Systems Via Knowledge-enhanced Prompt Learning Xiaolei Wang, Kun Zhou, Ji-rong Wen, Wayne Xin Zhao
- Codegen: An Open Large Language Model For Code With Multi-turn Program Synthesis Erik Nijkamp et al.
- Capturing Failures Of Large Language Models Via Human Cognitive Biases Erik Jones, Jacob Steinhardt
- Prompt Distribution Learning Yuning Lu, Jianzhuang Liu, Yonggang Zhang, Yajing Liu, Xinmei Tian
- Star: Bootstrapping Reasoning With Reasoning Eric Zelikman, Yuhuai Wu, Jesse Mu, Noah D. Goodman
- Hyperprompt: Prompt-based Task-conditioning Of Transformers Yun He et al.
- Legal Prompt Engineering For Multilingual Legal Judgement Prediction Dietrich Trautmann, Alina Petrova, Frank Schilder
- Successive Prompting For Decomposing Complex Questions Dheeru Dua, Shivanshu Gupta, Sameer Singh, Matt Gardner
- VIMA: General Robot Manipulation With Multimodal Prompts Yunfan Jiang et al.
- Visual-language Navigation Pretraining Via Prompt-based Environmental Self-exploration Xiwen Liang, Fengda Zhu, Lingling Li, Hang Xu, Xiaodan Liang
- Binding Language Models In Symbolic Languages Zhoujun Cheng et al.
- Least-to-most Prompting Enables Complex Reasoning In Large Language Models Denny Zhou et al.
- Prompting Palm For Translation: Assessing Strategies And Performance David Vilar et al.
- Adaprompt: Adaptive Model Training For Prompt-based NLP Yulong Chen et al.
- Language Model Cascades David Dohan et al.
- Rationale-augmented Ensembles In Language Models Xuezhi Wang et al.
- Unified Vision And Language Prompt Learning Yuhang Zang, Wei Li, Kaiyang Zhou, Chen Huang, Chen Change Loy
- Discovering Latent Knowledge In Language Models Without Supervision Collin Burns, Haotian Ye, Dan Klein, Jacob Steinhardt
- Augesc: Dialogue Augmentation With Large Language Models For Emotional Support Conversation Chujie Zheng, Sahand Sabour, Jiaxin Wen, Zheng Zhang, Minlie Huang
- Promda: Prompt-based Data Augmentation For Low-resource NLU Tasks Yufei Wang et al.
- Prompt For Extraction? PAIE: Prompting Argument Interaction For Event Argument Extraction Yubo Ma et al.
- Linearly Mapping From Image To Text Space Jack Merullo, Louis Castricato, Carsten Eickhoff, Ellie Pavlick
- Prompting GPT-3 To Be Reliable Chenglei Si et al.
- A Unified Multi-task Learning Framework For Multi-goal Conversational Recommender Systems Yang Deng et al.
- Self-consistency Improves Chain Of Thought Reasoning In Language Models Xuezhi Wang et al.
- Code4struct: Code Generation For Few-shot Event Structure Prediction Xingyao Wang, Sha Li, Heng Ji
- Texts As Images In Prompt Tuning For Multi-label Image Recognition Zixian Guo et al.
- No More Fine-tuning? An Experimental Evaluation Of Prompt Tuning In Code Intelligence Chaozheng Wang et al.
- Complexity-based Prompting For Multi-step Reasoning Yao Fu, Hao Peng, Ashish Sabharwal, Peter Clark, Tushar Khot
- Exploring Length Generalization In Large Language Models Cem Anil et al.
- IDPG: An Instance-dependent Prompt Generation Method Zhuofeng Wu et al.
- Optimizing Prompts For Text-to-image Generation Yaru Hao, Zewen Chi, Li Dong, Furu Wei
- Enabling Conversational Interaction With Mobile UI Using Large Language Models Bryan Wang, Gang Li, Yang Li
- Iteratively Prompt Pre-trained Language Models For Chain Of Thought Boshi Wang, Xiang Deng, Huan Sun
- Expanding Language-image Pretrained Models For General Video Recognition Bolin Ni et al.
- Audiolm: A Language Modeling Approach To Audio Generation Zalán Borsos et al.
- Language Models Can See: Plugging Visual Controls In Text Generation Yixuan Su et al.
- Analogy Generation By Prompting Large Language Models: A Case Study Of Instructgpt Bhavya Bhavya, Jinjun Xiong, Chengxiang Zhai
- BLOOM: A 176b-parameter Open-access Multilingual Language Model Bigscience Workshop et al.
- Multi-lingual Evaluation Of Code Generation Models Ben Athiwaratkun et al.
- Prompt-aligned Gradient For Prompt Tuning Beier Zhu, Yulei Niu, Yucheng Han, Yue Wu, Hanwang Zhang
- Large Language Models Are Better Reasoners With Self-verification Yixuan Weng et al.
- Automatic Chain Of Thought Prompting In Large Language Models Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola
- Grips: Gradient-free, Edit-based Instruction Search For Prompting Large Language Models Archiki Prasad, Peter Hase, Xiang Zhou, Mohit Bansal
- UL2: Unifying Language Learning Paradigms Yi Tay et al.
- Internet-augmented Language Models Through Few-shot Prompting For Open-domain Question Answering Angeliki Lazaridou, Elena Gribovskaya, Wojciech Stokowiec, Nikolai Grigorev
- Socratic Models: Composing Zero-shot Multimodal Reasoning With Language Andy Zeng et al.
- Qaner: Prompting Question Answering Models For Few-shot Named Entity Recognition Andy T. Liu et al.
- Making Large Language Models Better Reasoners With Step-aware Verifier Yifei Li et al.
- Compositional Semantic Parsing With Large Language Models Andrew Drozdov et al.
- Large Language Models Are Human-level Prompt Engineers Yongchao Zhou et al.
- Generating Training Data With Language Models: Towards Zero-shot Language Understanding Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han
- Prompt-to-prompt Image Editing With Cross Attention Control Amir Hertz et al.
- Text And Patterns: For Effective Chain Of Thought, It Takes Two To Tango Aman Madaan, Amir Yazdanbakhsh
- Improving Alignment Of Dialogue Agents Via Targeted Human Judgements Amelia Glaese et al.
- Memory-assisted Prompt Editing To Improve GPT-3 After Deployment Aman Madaan, Niket Tandon, Peter Clark, Yiming Yang
- Can Language Models Learn From Explanations In Context? Andrew K. Lampinen et al.
- Standing On The Shoulders Of Giant Frozen Language Models Yoav Levine et al.
- Dual Modality Prompt Tuning For Vision-language Pre-trained Model Yinghui Xing et al.
- Position-guided Text Prompt For Vision-language Pre-training Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan
- Personalized Prompt For Sequential Recommendation Yiqing Wu et al.
- Dualprompt: Complementary Prompting For Rehearsal-free Continual Learning Zifeng Wang et al.
- ATTEMPT: Parameter-efficient Multi-task Tuning Via Attentional Mixtures Of Soft Prompts Akari Asai, Mohammadreza Salehi, Matthew E. Peters, Hannaneh Hajishirzi
- Prompt Tuning For Discriminative Pre-trained Language Models Yuan Yao et al.
- Storydall-e: Adapting Pretrained Text-to-image Transformers For Story Continuation Adyasha Maharana, Darryl Hannan, Mohit Bansal
- LASP: Text-to-text Optimization For Language-aware Soft Prompting Of Vision & Language Models Adrian Bulat, Georgios Tzimiropoulos
- Language Models Are Greedy Reasoners: A Systematic Formal Analysis Of Chain-of-thought Abulhair Saparov, He He
- Beyond The Imitation Game: Quantifying And Extrapolating The Capabilities Of Language Models Aarohi Shammie Srivastava et al.
- PEVL: Position-enhanced Pre-training And Prompt Tuning For Vision-language Models Yuan Yao et al.
- Retrieval-augmented Generative Question Answering For Event Argument Extraction Xinya Du, Heng Ji
- Dynamic Prompt Learning Via Policy Gradient For Semi-structured Mathematical Reasoning Pan Lu et al.
- Code Generation Tools (almost) For Free? A Study Of Few-shot, Pre-trained Language Models On Code Patrick Bareiß, Beatriz Souza, Marcelo D'amorim, Michael Pradel
- Unnatural Instructions: Tuning Language Models With (almost) No Human Labor Or Honovich, Thomas Scialom, Omer Levy, Timo Schick
- On Second Thought, Let's Not Think Step By Step! Bias And Toxicity In Zero-shot Reasoning Omar Shaikh, Hongxin Zhang, William Held, Michael Bernstein, Diyi Yang
- ROSCOE: A Suite Of Metrics For Scoring Step-by-step Reasoning Olga Golovneva et al.
- Measuring And Narrowing The Compositionality Gap In Language Models Ofir Press et al.
- Grounding Language With Visual Affordances Over Unstructured Data Oier Mees, Jessica Borja-diaz, Wolfram Burgard
- Demonstrate-search-predict: Composing Retrieval And Language Models For Knowledge-intensive NLP Omar Khattab et al.
- LIFT: Language-interfaced Fine-tuning For Non-language Machine Learning Tasks Tuan Dinh et al.
- Learning To Compose Soft Prompts For Compositional Zero-shot Learning Nihal V. Nayak, Peilin Yu, Stephen H. Bach
- Quantifying Memorization Across Neural Language Models Nicholas Carlini et al.
- SGPT: GPT Sentence Embeddings For Semantic Search Niklas Muennighoff
- Crosslingual Generalization Through Multitask Finetuning Niklas Muennighoff et al.
- Large Language Models Are Reasoning Teachers Namgyu Ho, Laura Schmid, Se-young Yun
- Generate Rather Than Retrieve: Large Language Models Are Strong Context Generators Wenhao Yu et al.
- Maple: Multi-modal Prompt Learning Muhammad Uzair Khattak, Hanoona Rasheed, Muhammad Maaz, Salman Khan, Fahad Shahbaz Khan
- Clinical Prompt Learning With Frozen Language Models Niall Taylor, Yi Zhang, Dan Joyce, Alejo Nevado-holgado, Andrey Kormilitzin
- Toxigen: A Large-scale Machine-generated Dataset For Adversarial And Implicit Hate Speech Detection Thomas Hartvigsen et al.
- Challenging Big-bench Tasks And Whether Chain-of-thought Can Solve Them Mirac Suzgun et al.
- Rlprompt: Optimizing Discrete Text Prompts With Reinforcement Learning Mingkai Deng et al.
- GPT Takes The Bar Exam Michael Bommarito II, Daniel Martin Katz
- Black-box Tuning For Language-model-as-a-service Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu
- Large Language Models Are Zero-shot Reasoners Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa
- Promptsource: An Integrated Development Environment And Repository For Natural Language Prompts Stephen H. Bach et al.
- Decomposed Prompting: A Modular Approach For Solving Complex Tasks Tushar Khot et al.
- Co-writing Screenplays And Theatre Scripts With Language Models: An Evaluation By Industry Professionals Piotr Mirowski, Kory W. Mathewson, Jaylen Pittman, Richard Evans
- 3DALL-E: Integrating Text-to-image AI In 3D Design Workflows Vivian Liu, Jo Vermeulen, George Fitzmaurice, Justin Matejka
- M6-rec: Generative Pretrained Language Models Are Open-ended Recommender Systems Zeyu Cui, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang
- Holistic Evaluation Of Language Models Percy Liang et al.
- PINTO: Faithful Language Reasoning Using Prompt-generated Rationales Peifeng Wang, Aaron Chan, Filip Ilievski, Muhao Chen, Xiang Ren
- Conversing With Copilot: Exploring Prompt Engineering For Solving CS1 Problems Using Natural Language Paul Denny, Viraj Kumar, Nasser Giacaman
- Quantifying Language Models' Sensitivity To Spurious Features In Prompt Design Or: How I Learned To Start Worrying About Prompt Formatting Melanie Sclar, Yejin Choi, Yulia Tsvetkov, Alane Suhr
- Zero- And Few-shot Prompting With Llms: A Comparative Study With Fine-tuned Models For Bangla Sentiment Analysis Md. Arid Hasan et al.
- Can Llms Express Their Uncertainty? An Empirical Evaluation Of Confidence Elicitation In Llms Miao Xiong et al.
- Enhancing CLIP With GPT-4: Harnessing Visual Descriptions As Prompts Mayug Maniparambil et al.
- Unleashing The Emergent Cognitive Synergy In Large Language Models: A Task-solving Agent Through Multi-persona Self-collaboration Zhenhailong Wang et al.
- An Empirical Evaluation Of Using Large Language Models For Automated Unit Test Generation Max Schäfer, Sarah Nadi, Aryaz Eghbali, Frank Tip
- Errors Are Useful Prompts: Instruction Guided Task Programming With Verifier-assisted Iterative Prompting Marta Skreta et al.
- LLM Self Defense: By Self Examination, Llms Know They Are Being Tricked Mansi Phute et al.
- Dictionary-based Phrase-level Prompting Of Large Language Models For Machine Translation Marjan Ghazvininejad, Hila Gonen, Luke Zettlemoyer
- Applenet: Visual Attention Parameterized Prompt Learning For Few-shot Remote Sensing Image Generalization Using CLIP Mainak Singha, Ankit Jha, Bhupendra Solanki, Shirsha Bose, Biplab Banerjee
- Chatgpt For Vulnerability Detection, Classification, And Repair: How Far Are We? Michael Fu, Chakkrit Tantithamthavorn, Van Nguyen, Trung Le
- Document-level Machine Translation With Large Language Models Longyue Wang et al.
- Llm-grounded Diffusion: Enhancing Prompt Understanding Of Text-to-image Diffusion Models With Large Language Models Long Lian, Boyi Li, Adam Yala, Trevor Darrell
- Enhancing Few-shot Text-to-sql Capabilities Of Large Language Models: A Study On Prompt Design Strategies Linyong Nan et al.
- Leveraging Pre-trained Large Language Models To Construct And Utilize World Models For Model-based Task Planning Lin Guan, Karthik Valmeekam, Sarath Sreedharan, Subbarao Kambhampati
- Do Llms Exhibit Human-like Response Biases? A Case Study In Survey Design Lindia Tjuatja, Valerie Chen, Sherry Tongshuang Wu, Ameet Talwalkar, Graham Neubig
- Next-step Hint Generation For Introductory Programming Using Large Language Models Lianne Roest, Hieke Keuning, Johan Jeuring
- Automatically Correcting Large Language Models: Surveying The Landscape Of Diverse Self-correction Strategies Liangming Pan et al.
- Logic-lm: Empowering Large Language Models With Symbolic Solvers For Faithful Logical Reasoning Liangming Pan, Alon Albalak, Xinyi Wang, William Yang Wang
- A Survey On Large Language Models For Recommendation Likang Wu et al.
- Zephyr: Direct Distillation Of LM Alignment Lewis Tunstall et al.
- In-context Impersonation Reveals Large Language Models' Strengths And Biases Leonard Salewski, Stephan Alaniz, Isabel Rio-torto, Eric Schulz, Zeynep Akata
- Layoutllm-t2i: Eliciting Layout Guidance From LLM For Text-to-image Generation Leigang Qu, Shengqiong Wu, Hao Fei, Liqiang Nie, Tat-seng Chua
- Query2doc: Query Expansion With Large Language Models Liang Wang, Nan Yang, Furu Wei
- Zero-shot Next-item Recommendation Using Large Pretrained Language Models Lei Wang, Ee-peng Lim
- Sentimentgpt: Exploiting GPT For Advanced Sentiment Analysis And Its Departure From Current Machine Learning Kiana Kheiri, Hamid Karimi
- Just Tell Me: Prompt Engineering In Business Process Management Kiran Busch, Alexander Rochlitzer, Diana Sola, Henrik Leopold
- Tallrec: An Effective And Efficient Tuning Framework To Align Large Language Model With Recommendation Keqin Bao et al.
- Large Language Models And Simple, Stupid Bugs Kevin Jesse, Toufique Ahmed, Premkumar T. Devanbu, Emily Morgan
- Automatic Prompt Augmentation And Selection With Chain-of-thought From Labeled Data Kashun Shum, Shizhe Diao, Tong Zhang
- Biomedical Knowledge Graph-optimized Prompt Generation For Large Language Models Karthik Soman et al.
- Mvp: Multi-view Prompting Improves Aspect Sentiment Tuple Prediction Zhibin Gou, Qingyan Guo, Yujiu Yang
- Towards Expert-level Medical Question Answering With Large Language Models Karan Singhal et al.
- Chipgpt: How Far Are We From Natural Language Hardware Design Kaiyan Chang et al.
- Speechprompt V2: Prompt Tuning For Speech Classification Tasks Kai-wei Chang et al.
- Not What You've Signed Up For: Compromising Real-world Llm-integrated Applications With Indirect Prompt Injection Kai Greshake et al.
- Agentcf: Collaborative Learning With Autonomous Language Agents For Recommender Systems Junjie Zhang et al.
- Is Chatgpt A Good Recommender? A Preliminary Study Junling Liu et al.
- Chatcounselor: A Large Language Models For Mental Health Support June M. Liu et al.
- Backdooring Instruction-tuned Large Language Models With Virtual Prompt Injection Jun Yan et al.
- Transferable Decoding With Visual Entities For Zero-shot Image Captioning Junjie Fei et al.
- Spear Phishing With Large Language Models Julian Hazell
- Jatmo: Prompt Injection Defense By Task-specific Finetuning Julien Piet et al.
- Exploring The Benefits Of Training Expert Language Models Over Instruction Tuning Joel Jang et al.
- The Political Ideology Of Conversational AI: Converging Evidence On Chatgpt's Pro-environmental, Left-libertarian Orientation Jochen Hartmann, Jasper Schwenzow, Maximilian Witte
- The Potential And Pitfalls Of Using A Large Language Model Such As Chatgpt Or GPT-4 As A Clinical Assistant Jingqing Zhang et al.
- Prompt-and-align: Prompt-based Social Alignment For Few-shot Fake News Detection Jiaying Wu, Shen Li, Ailin Deng, Miao Xiong, Bryan Hooi
- A Systematic Survey Of Prompt Engineering On Vision-language Foundation Models Jindong Gu et al.
- Geotechnical Parrot Tales (GPT): Harnessing Large Language Models In Geotechnical Engineering Krishna Kumar
- Unified-io 2: Scaling Autoregressive Multimodal Models With Vision, Language, Audio, And Action Jiasen Lu et al.
- Set-of-mark Prompting Unleashes Extraordinary Visual Grounding In GPT-4V Jianwei Yang et al.
- Think-on-graph: Deep And Responsible Reasoning Of Large Language Model On Knowledge Graph Jiashuo Sun et al.
- A Unified Generative Retriever For Knowledge-intensive Language Tasks Via Prompt Learning Jiangui Chen et al.
- Compositional Exemplars For In-context Learning Jiacheng Ye, Zhiyong Wu, Jiangtao Feng, Tao Yu, Lingpeng Kong
- LLM Lies: Hallucinations Are Not Bugs, But Features As Adversarial Examples Jia-yu Yao et al.
- GPT-3.5, GPT-4, Or BARD? Evaluating Llms Reasoning Ability In Zero-shot Setting And Performance Boosting Through Prompts Jessica López Espejel, El Hassane Ettifouri, Mahaman Sanoussi Yahaya Alassan, El Mehdi Chouham, Walid Dahhane
- Learning To Compress Prompts With Gist Tokens Jesse Mu, Xiang Lisa Li, Noah Goodman
- Symbol Tuning Improves In-context Learning In Language Models Jerry Wei et al.
- Prompting Is Not A Substitute For Probability Measurements In Large Language Models Jennifer Hu, Roger Levy
- Evaluating Large Language Models On A Highly-specialized Topic, Radiation Oncology Physics Jason Holmes et al.
- Chatgpt: Jack Of All Trades, Master Of None Jan Kocoń et al.
- Chatgpt To Replace Crowdsourcing Of Paraphrases For Intent Classification: Higher Diversity And Comparable Model Robustness Jan Cegin, Jakub Simko, Peter Brusilovsky
- More Robots Are Coming: Large Multimodal Models (chatgpt) Can Solve Visually Diverse Images Of Parsons Problems Irene Hou et al.
- Chainforge: A Visual Toolkit For Prompt Engineering And LLM Hypothesis Testing Ian Arawjo, Chelse Swoopes, Priyan Vaithilingam, Martin Wattenberg, Elena Glassman
- Principle-driven Self-alignment Of Language Models From Scratch With Minimal Human Supervision Zhiqing Sun et al.
- Ip-adapter: Text Compatible Image Prompt Adapter For Text-to-image Diffusion Models Hu Ye, Jun Zhang, Sibo Liu, Xiao Han, Wei Yang
- Llmlingua: Compressing Prompts For Accelerated Inference Of Large Language Models Huiqiang Jiang, Qianhui Wu, Chin-yew Lin, Yuqing Yang, Lili Qiu
- Chatgpt Chemistry Assistant For Text Mining And Prediction Of MOF Synthesis Zhiling Zheng, Oufan Zhang, Christian Borgs, Jennifer T. Chayes, Omar M. Yaghi
- Semantic Compression With Large Language Models Henry Gilbert, Michael Sandborn, Douglas C. Schmidt, Jesse Spencer-smith, Jules White
- Bioinstruct: Instruction Tuning Of Large Language Models For Biomedical Natural Language Processing Hieu Tran, Zhichao Yang, Zonghai Yao, Hong Yu
- Boosting Theory-of-mind Performance In Large Language Models Via Prompting Shima Rahimi Moghaddam, Christopher J. Honey
- Mathprompter: Mathematical Reasoning Using Large Language Models Shima Imani, Liang Du, Harsh Shrivastava
- VIP5: Towards Multimodal Foundation Models For Recommendation Shijie Geng, Juntao Tan, Shuchang Liu, Zuohui Fu, Yongfeng Zhang
- Llm-in-the-loop: Leveraging Large Language Model For Thematic Analysis Shih-chieh Dai, Aiping Xiong, Lun-wei Ku
- Reasoning With Language Model Is Planning With World Model Shibo Hao et al.
- Large Language Model Augmented Narrative Driven Recommendations Sheshera Mysore, Andrew Mccallum, Hamed Zamani
- Toolkengpt: Augmenting Frozen Language Models With Massive Tools Via Tool Embeddings Shibo Hao, Tianyang Liu, Zhen Wang, Zhiting Hu
- Gorilla: Large Language Model Connected With Massive Apis Shishir G. Patil, Tianjun Zhang, Xin Wang, Joseph E. Gonzalez
- Multitask Prompt Tuning Enables Parameter-efficient Transfer Learning Zhen Wang et al.
- The Flan Collection: Designing Data And Methods For Effective Instruction Tuning Shayne Longpre et al.
- Evaluation Of Chatgpt Family Of Models For Biomedical Reasoning And Classification Shan Chen et al.
- Sur-adapter: Enhancing Text-to-image Pre-trained Diffusion Models With Large Language Models Shanshan Zhong, Zhongzhan Huang, Wushao Wen, Jinghui Qin, Liang Lin
- On Codex Prompt Engineering For OCL Generation: An Empirical Study Seif Abukhalaf, Mohammad Hamdaqa, Foutse Khomh
- Language Is Not All You Need: Aligning Perception With Language Models Shaohan Huang et al.
- Generating Phishing Attacks Using Chatgpt Sayak Saha Roy, Krishna Vamsi Naragam, Shirin Nilizadeh
- Large Language Models Are Competitive Near Cold-start Recommenders For Language- And Item-based Preferences Scott Sanner, Krisztian Balog, Filip Radlinski, Ben Wedin, Lucas Dixon
- Verify-and-edit: A Knowledge-enhanced Chain-of-thought Framework Ruochen Zhao, Xingxuan Li, Shafiq Joty, Chengwei Qin, Lidong Bing
- Fine-tuning Language Models With Just Forward Passes Sadhika Malladi et al.
- Chatgpt Vs. Google: A Comparative Study Of Search Performance And User Experience Ruiyun Rayna Xu, Yue Katherine Feng, Hailiang Chen
- Gpt4tools: Teaching Large Language Model To Use Tools Via Self-instruction Rui Yang et al.
- Prompting For Multimodal Hateful Meme Classification Rui Cao, Roy Ka-wei Lee, Wen-haw Chong, Jing Jiang
- Llm-assisted Content Analysis: Using Large Language Models To Support Deductive Coding Robert Chew, John Bollenbacher, Michael Wenger, Jessica Speer, Annice Kim
- Catalyst: Domain-extensible Intervention For Preventing Task Procrastination Using Large Generative Models Riku Arakawa, Hiromu Yakura, Masataka Goto
- Prompt, Generate, Then Cache: Cascade Of Foundation Models Makes Strong Few-shot Learners Renrui Zhang et al.
- Automatic Prompt Optimization With "gradient Descent" And Beam Search Reid Pryzant et al.
- Llama-adapter: Efficient Fine-tuning Of Language Models With Zero-init Attention Renrui Zhang et al.
- Pro-cap: Leveraging A Frozen Vision-language Model For Hateful Meme Detection Rui Cao et al.
- Prompt-based Distribution Alignment For Unsupervised Domain Adaptation Shuanghao Bai et al.
- VELMA: Verbalization Embodiment Of LLM Agents For Vision And Language Navigation In Street View Raphael Schumann et al.
- How Secure Is Code Generated By Chatgpt? Raphaël Khoury, Anderson R. Avila, Jacob Brunelle, Baba Mamadou Camara
- Prompt Engineering A Prompt Engineer Qinyuan Ye, Maxamed Axmed, Reid Pryzant, Fereshte Khani
- Can Large Language Models Replace Humans In The Systematic Review Process? Evaluating Gpt-4's Efficacy In Screening And Extracting Data From Peer-reviewed And Grey Literature In Multiple Languages Qusai Khraisha, Sophie Put, Johanna Kappenberg, Azza Warraitch, Kristin Hadfield
- Translating Radiology Reports Into Plain Language Using Chatgpt And GPT-4 With Prompt Learning: Promising Results, Limitations, And Potential Qing Lyu et al.
- ONCE: Boosting Content-based Recommendation With Both Open- And Closed-source Large Language Models Qijiong Liu, Nuo Chen, Tetsuya Sakai, Xiao-ming Wu
- Faithful Chain-of-thought Reasoning Qing Lyu et al.
- Grounded Text-to-image Synthesis With Attention Refocusing Quynh Phung, Songwei Ge, Jia-bin Huang
- Genegpt: Augmenting Large Language Models With Domain Tools For Improved Access To Biomedical Information Qiao Jin, Yifan Yang, Qingyu Chen, Zhiyong Lu
- Prompting The Hidden Talent Of Web-scale Speech Models For Zero-shot Task Generalization Puyuan Peng, Brian Yan, Shinji Watanabe, David Harwath
- Are Large Language Models Geospatially Knowledgeable? Prabin Bhandari, Antonios Anastasopoulos, Dieter Pfoser
- Harnessing Llms In Curricular Design: Using GPT-4 To Support Authoring Of Learning Objectives Pragnya Sridhar et al.
- Large Language Models Sensitivity To The Order Of Options In Multiple-choice Questions Pouya Pezeshkpour, Estevam Hruschka
- Selfcheckgpt: Zero-resource Black-box Hallucination Detection For Generative Large Language Models Potsawee Manakul, Adian Liusie, Mark J. F. Gales
- Students' Perceptions And Preferences Of Generative Artificial Intelligence Feedback For Programming Zhengdong Zhang et al.
- Visually-prompted Language Model For Fine-grained Scene Graph Generation In An Open World Qifan Yu et al.
- Pre-train, Prompt And Recommendation: A Comprehensive Survey Of Language Modelling Paradigm Adaptations In Recommender Systems Peng Liu, Lemei Zhang, Jon Atle Gulla
- Audiopalm: A Large Language Model That Can Speak And Listen Paul K. Rubenstein et al.
- Graphologue: Exploring Large Language Model Responses With Interactive Diagrams Peiling Jiang, Jude Rayan, Steven P. Dow, Haijun Xia
- Starcoder: May The Source Be With You! Raymond Li et al.
- Going Beyond Nouns With Vision & Language Models Using Synthetic Data Paola Cascante-bonilla et al.
- Dspy: Compiling Declarative Language Model Calls Into Self-improving Pipelines Omar Khattab et al.
- Ontochatgpt Information System: Ontology-driven Structured Prompts For Chatgpt Meta-learning Oleksandr Palagin, Vladislav Kaverinskiy, Anna Litvin, Kyrylo Malakhov
- Hallucinations In Large Multilingual Translation Models Nuno M. Guerreiro et al.
- Chatgpt Is A Knowledgeable But Inexperienced Solver: An Investigation Of Commonsense Problem In Large Language Models Ning Bian et al.
- Automated Annotation With Generative AI Requires Validation Nicholas Pangakis, Samuel Wolken, Neil Fasching
- Self-contradictory Hallucinations Of Large Language Models: Evaluation, Detection And Mitigation Niels Mündler, Jingxuan He, Slobodan Jenko, Martin Vechev
- Consistency Analysis Of Chatgpt Myeongjun Erik Jang, Thomas Lukasiewicz
- Self-regulating Prompts: Foundational Model Adaptation Without Forgetting Muhammad Uzair Khattak et al.
- Label Supervised Llama Finetuning Zongxi Li et al.
- State Of What Art? A Call For Multi-prompt LLM Evaluation Moran Mizrahi et al.
- Using Large Language Models To Generate Junit Tests: An Empirical Study Mohammed Latif Siddiq et al.
- DIN-SQL: Decomposed In-context Learning Of Text-to-sql With Self-correction Mohammadreza Pourreza, Davood Rafiei
- Abscribe: Rapid Exploration & Organization Of Multiple Writing Variations In Human-ai Co-writing Tasks Using Large Language Models Mohi Reza et al.
- A Review Of Chatgpt Applications In Education, Marketing, Software Engineering, And Healthcare: Benefits, Drawbacks, And Research Directions Mohammad Fraiwan, Natheer Khasawneh
- Introducing Language Guidance In Prompt-based Continual Learning Muhammad Gul Zain Ali Khan et al.
- Open Sesame! Universal Black Box Jailbreaking Of Large Language Models Raz Lapid, Ron Langberg, Moshe Sipper
- Time-llm: Time Series Forecasting By Reprogramming Large Language Models Ming Jin et al.
- Reward Design With Language Models Minae Kwon, Sang Michael Xie, Kalesha Bullard, Dorsa Sadigh
- Evaluating Large Language Models In Theory Of Mind Tasks Michal Kosinski
- A Large Language Model Approach To Educational Survey Feedback Analysis Michael J. Parker, Caitlin Anderson, Claire Stone, Yearim Oh
- Codekgc: Code Language Model For Generative Knowledge Graph Construction Zhen Bi et al.
- Large Language Models Are Effective Text Rankers With Pairwise Ranking Prompting Zhen Qin et al.
- Empirical Study Of Zero-shot NER With Chatgpt Tingyu Xie et al.
- Large Language Models Are State-of-the-art Evaluators Of Translation Quality Tom Kocmi, Christian Federmann
- Enabling Large Language Models To Generate Text With Citations Tianyu Gao, Howard Yen, Jiatong Yu, Danqi Chen
- Generalized Planning In PDDL Domains With Pretrained Large Language Models Tom Silver et al.
- Diagnostic Reasoning Prompts Reveal The Potential For Large Language Model Interpretability In Medicine Thomas Savage, Ashwin Nayak, Robert Gallo, Ekanath Rangan, Jonathan H Chen
- Cognitive Architectures For Language Agents Theodore R. Sumers, Shunyu Yao, Karthik Narasimhan, Thomas L. Griffiths
- Having Beer After Prayer? Measuring Cultural Bias In Large Language Models Tarek Naous, Michael J. Ryan, Alan Ritter, Wei Xu
- Caption Anything: Interactive Image Description With Diverse Multimodal Controls Teng Wang et al.
- Open-ended Medical Visual Question Answering Through Prefix Tuning Of Language Models Tom Van Sonsbeek, Mohammad Mahdi Derakhshani, Ivona Najdenkoska, Cees G. M. Snoek, Marcel Worring
- What Can Large Language Models Do In Chemistry? A Comprehensive Benchmark On Eight Tasks Taicheng Guo et al.
- Chatgpt: Beginning Of An End Of Manual Linguistic Data Annotation? Use Case Of Automatic Genre Identification Taja Kuzman, Igor Mozetič, Nikola Ljubešić
- Sparks Of Artificial General Intelligence: Early Experiments With GPT-4 Sébastien Bubeck et al.
- Evallm: Interactive Evaluation Of Large Language Model Prompts On User-defined Criteria Tae Soo Kim, Yoonjoo Lee, Jamin Shin, Young-ho Kim, Juho Kim
- Large Language Models As General Pattern Machines Suvir Mirchandani et al.
- Uncovering Chatgpt's Capabilities In Recommender Systems Sunhao Dai et al.
- Analyzing The Performance Of GPT-3.5 And GPT-4 In Grammatical Error Correction Steven Coyne, Keisuke Sakaguchi, Diana Galvan-sosa, Michael Zock, Kentaro Inui
- Pretraining Language Models With Human Preferences Tomasz Korbak et al.
- Chatgpt Perpetuates Gender Bias In Machine Translation And Ignores Non-gendered Pronouns: Findings Across Bengali And Five Other Low-resource Languages Sourojit Ghosh, Aylin Caliskan
- Chatgpt Is Fun, But It Is Not Funny! Humor Is Still Challenging Large Language Models Sophie Jentzsch, Kristian Kersting
- Expressive Text-to-image Generation With Rich Text Songwei Ge, Taesung Park, Jun-yan Zhu, Jia-bin Huang
- Revisiting Relation Extraction In The Era Of Large Language Models Somin Wadhwa, Silvio Amir, Byron C. Wallace
- Principled Instructions Are All You Need For Questioning Llama-1/2, GPT-3.5/4 Sondos Mahmoud Bsharat, Aidar Myrzakhan, Zhiqiang Shen
- Llm-empowered Chatbots For Psychiatrist And Patient Simulation: Application And Evaluation Siyuan Chen et al.
- Metagpt: Meta Programming For A Multi-agent Collaborative Framework Sirui Hong et al.
- Thoughtsource: A Central Hub For Large Language Model Reasoning Data Simon Ott et al.
- LL3DA: Visual Interactive Instruction Tuning For Omni-3d Understanding, Reasoning, And Planning Sijin Chen et al.
- Mariogpt: Open-ended Text2level Generation Through Large Language Models Shyam Sudhakaran et al.
- Tree Of Thoughts: Deliberate Problem Solving With Large Language Models Shunyu Yao et al.
- Automl-gpt: Automatic Machine Learning With GPT Shujian Zhang, Chengyue Gong, Lemeng Wu, Xingchao Liu, Mingyuan Zhou
- Retrieving Supporting Evidence For Generative Question Answering Siqing Huo, Negar Arabzadeh, Charles L. A. Clarke
- Unlocking The Potential Of Chatgpt: A Comprehensive Exploration Of Its Applications, Advantages, Limitations, And Future Directions In Natural Language Processing Walid Hariri
- Inpars-v2: Large Language Models As Efficient Dataset Generators For Information Retrieval Vitor Jeronymo et al.
- Is GPT-4 A Reliable Rater? Evaluating Consistency In GPT-4 Text Ratings Veronika Hackl, Alexandra Elena Müller, Michael Granitzer, Maximilian Sailer
- Fully Autonomous Programming With Large Language Models Vadim Liventsev, Anastasiia Grishina, Aki Härmä, Leon Moonen
- Automated Reading Passage Generation With Openai's Large Language Model Ummugul Bezirhan, Matthias Von Davier
- Automating Human Tutor-style Programming Feedback: Leveraging GPT-4 Tutor Model For Hint Generation And GPT-3.5 Student Model For Hint Validation Tung Phung et al.
- Freshllms: Refreshing Large Language Models With Search Engine Augmentation Tu Vu et al.
- Better Patching Using LLM Prompting, Via Self-consistency Toufique Ahmed, Premkumar Devanbu
- Automatic Semantic Augmentation Of Language Model Prompts (for Code Summarization) Toufique Ahmed, Kunal Suresh Pai, Premkumar Devanbu, Earl T. Barr
- Promptcblue: A Chinese Prompt Tuning Benchmark For The Medical Domain Wei Zhu, Xiaoling Wang, Huanran Zheng, Mosha Chen, Buzhou Tang
- A Preliminary Evaluation Of Chatgpt For Zero-shot Dialogue Understanding Wenbo Pan, Qiguang Chen, Xiao Xu, Wanxiang Che, Libo Qin
- GPT Detectors Are Biased Against Non-native English Writers Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, James Zou
- BLIVA: A Simple Multimodal LLM For Better Handling Of Text-rich Visual Questions Wenbo Hu et al.
- Roco: Dialectic Multi-robot Collaboration With Large Language Models Zhao Mandi, Shreeya Jain, Shuran Song
- Promptify: Text-to-image Generation Through Interactive Prompt Exploration With Large Language Models Stephen Brade, Bryan Wang, Mauricio Sousa, Sageev Oore, Tovi Grossman
- Chatgpt For PLC/DCS Control Logic Generation Heiko Koziolek, Sten Gruener, Virendra Ashiwal
- Capabilities Of GPT-4 On Medical Challenge Problems Harsha Nori, Nicholas King, Scott Mayer Mckinney, Dean Carignan, Eric Horvitz
- Can Generalist Foundation Models Outcompete Special-purpose Tuning? Case Study In Medicine Harsha Nori et al.
- Not All Languages Are Created Equal In Llms: Improving Multilingual Capability By Cross-lingual-thought Prompting Haoyang Huang et al.
- Improved Baselines With Visual Instruction Tuning Haotian Liu, Chunyuan Li, Yuheng Li, Yong Jae Lee
- Is Chatgpt The Ultimate Programming Assistant -- How Far Is It? Haoye Tian et al.
- CMMLU: Measuring Massive Multitask Language Understanding In Chinese Haonan Li et al.
- Can Chatgpt Assess Human Personalities? A General Evaluation Framework Haocong Rao, Cyril Leung, Chunyan Miao
- Safety Assessment Of Chinese Large Language Models Hao Sun, Zhexin Zhang, Jiawen Deng, Jiale Cheng, Minlie Huang
- Reasoning Implicit Sentiment With Chain-of-thought Prompting Hao Fei et al.
- Visual-language Prompt Tuning With Knowledge-guided Context Optimization Hantao Yao, Rui Zhang, Changsheng Xu
- Glamm: Pixel Grounding Large Multimodal Model Hanoona Rasheed et al.
- Llm-rec: Personalized Recommendation Via Prompting Large Language Models Hanjia Lyu et al.
- Llama Guard: Llm-based Input-output Safeguard For Human-ai Conversations Hakan Inan et al.
- Prompting Large Language Models For Topic Modeling Han Wang et al.
- Choice Over Control: How Users Write With Large Language Models Using Diegetic And Non-diegetic Prompting Hai Dang, Sven Goller, Florian Lehmann, Daniel Buschek
- Applying Large Language Models And Chain-of-thought For Automatic Scoring Gyeong-geon Lee, Ehsan Latif, Xuansheng Wu, Ninghao Liu, Xiaoming Zhai
- Revisiting Large Language Models As Zero-shot Relation Extractors Guozheng Li, Peng Wang, Wenjun Ke
- Gender Bias And Stereotypes In Large Language Models Hadas Kotek, Rikker Dockum, David Q. Sun
- Chatgpt Hallucinates When Attributing Answers Guido Zuccon, Bevan Koopman, Razia Shaik
- Voyager: An Open-ended Embodied Agent With Large Language Models Guanzhi Wang et al.
- Dr Chatgpt, Tell Me What I Want To Hear: How Prompt Knowledge Impacts Health Answer Correctness Guido Zuccon, Bevan Koopman
- Language Models Can Solve Computer Tasks Geunwoo Kim, Pierre Baldi, Stephen Mcaleer
- Personality Traits In Large Language Models Greg Serapio-garcía et al.
- Explainability For Large Language Models: A Survey Haiyan Zhao et al.
- Do Large Language Models Show Decision Heuristics Similar To Humans? A Case Study Using GPT-3.5 Gaurav Suri, Lily R. Slater, Ali Ziaee, Morgan Nguyen
- Batch Prompting: Efficient Inference With Large Language Model Apis Zhoujun Cheng, Jungo Kasai, Tao Yu
- Multimodal Chatgpt For Medical Applications: An Experimental Study Of GPT-4V Zhiling Yan et al.
- Large Language Models Can Be Easily Distracted By Irrelevant Context Freda Shi et al.
- LLMR: Real-time Prompting Of Interactive Worlds Using Large Language Models Fernanda De La Torre et al.
- Exploring Human-like Translation Strategy With Large Language Models Zhiwei He et al.
- Is Chatgpt Better Than Human Annotators? Potential And Limitations Of Chatgpt In Explaining Implicit Hate Speech Fan Huang, Haewoon Kwak, Jisun An
- Learning To Prompt In The Classroom To Understand AI Limits: A Pilot Study Emily Theophilou et al.
- Language Model Crossover: Variation Through Few-shot Prompting Elliot Meyerson et al.
- Assigning AI: Seven Approaches For Students, With Prompts Ethan Mollick, Lilach Mollick
- Llm-adapters: An Adapter Family For Parameter-efficient Fine-tuning Of Large Language Models Zhiqiang Hu et al.
- Simulating H.P. Lovecraft Horror Literature With The Chatgpt Large Language Model Eduardo C. Garrido-merchán, José Luis Arroyo-barrigüete, Roberto Gozalo-brizuela
- Gptutor: A Chatgpt-powered Programming Tool For Code Explanation Eason Chen, Ray Huang, Han-shin Chen, Yuen-hsien Tseng, Liang-yi Li
- GPT-4 Can Pass The Korean National Licensing Examination For Korean Medicine Doctors Dongyeop Jang, Tae-rim Yun, Choong-yeol Lee, Young-kyu Kwon, Chang-eop Kim
- Read-only Prompt Optimization For Vision-language Few-shot Learning Dongjun Lee et al.
- Promptner: Prompting For Named Entity Recognition Dhananjay Ashok, Zachary C. Lipton
- Chatgpt Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions Deyao Zhu et al.
- Evaluating GPT-3.5 And GPT-4 Models On Brazilian University Admission Exams Desnes Nunes, Ricardo Primi, Ramon Pires, Roberto Lotufo, Rodrigo Nogueira
- Using An LLM To Help With Code Understanding Daye Nam, Andrew Macvean, Vincent Hellendoorn, Bogdan Vasilescu, Brad Myers
- Text-to-sql Empowered By Large Language Models: A Benchmark Evaluation Dawei Gao et al.
- Have Llms Advanced Enough? A Challenging Problem Solving Benchmark For Large Language Models Daman Arora, Himanshu Gaurav Singh, Mausam
- AI And The FCI: Can Chatgpt Project An Understanding Of Introductory Physics? Colin G. West
- LIMA: Less Is More For Alignment Chunting Zhou et al.
- Chatgpt Evaluation On Sentence Level Relations: A Focus On Temporal, Causal, And Discourse Relations Chunkit Chan et al.
- Conversational Automated Program Repair Chunqiu Steven Xia, Lingming Zhang
- Progressive-hint Prompting Improves Reasoning In Large Language Models Chuanyang Zheng, Zhengying Liu, Enze Xie, Zhenguo Li, Yu Li
- Debiasing Vision-language Models Via Biased Prompts Ching-yao Chuang, Varun Jampani, Yuanzhen Li, Antonio Torralba, Stefanie Jegelka
- Chateval: Towards Better Llm-based Evaluators Through Multi-agent Debate Chi-min Chan et al.
- Distilling Step-by-step! Outperforming Larger Language Models With Less Training Data And Smaller Model Sizes Cheng-yu Hsieh et al.
- An Iterative Optimizing Framework For Radiology Report Summarization With Chatgpt Chong Ma et al.
- Supporting Qualitative Analysis With Large Language Models: Combining Codebook With GPT-3 For Deductive Coding Ziang Xiao, Xingdi Yuan, Q. Vera Liao, Rania Abdelghani, Pierre-yves Oudeyer
- Visual Chatgpt: Talking, Drawing And Editing With Visual Foundation Models Chenfei Wu et al.
- Dipping Plms Sauce: Bridging Structure And Text For Effective Knowledge Graph Completion Via Conditional Soft Prompting Chen Chen, Yufei Wang, Aixin Sun, Bing Li, Kwok-yan Lam
- MME: A Comprehensive Evaluation Benchmark For Multimodal Large Language Models Chaoyou Fu et al.
- Generative Speech Recognition Error Correction With Large Language Models And Task-activating Prompting Chao-han Huck Yang et al.
- Blackvip: Black-box Visual Prompting For Robust Transfer Learning Changdae Oh et al.
- Llmseceval: A Dataset Of Natural Language Prompts For Security Evaluations Catherine Tony, Markus Mutas, Nicolás E. Díaz Ferreyra, Riccardo Scandariato
- Does GPT-4 Pass The Turing Test? Cameron R. Jones, Benjamin K. Bergen
- Receive, Reason, And React: Drive As You Say With Large Language Models In Autonomous Vehicles Can Cui, Yunsheng Ma, Xu Cao, Wenqian Ye, Ziran Wang
- Compositional Chain-of-thought Prompting For Large Multimodal Models Chancharik Mitra, Brandon Huang, Trevor Darrell, Roei Herzig
- Prompting Or Fine-tuning? A Comparative Study Of Large Language Models For Taxonomy Construction Boqi Chen, Fandi Yi, Dániel Varró
- Evaluation Of Chatgpt For Nlp-based Mental Health Applications Bishal Lamichhane
- Swiftsage: A Generative Agent With Fast And Slow Thinking For Complex Interactive Tasks Bill Yuchen Lin et al.
- Prompting Large Language Model For Machine Translation: A Case Study Biao Zhang, Barry Haddow, Alexandra Birch
- ART: Automatic Multi-step Reasoning And Tool-use For Large Language Models Bhargavi Paranjape et al.
- Motiongpt: Human Motion As A Foreign Language Biao Jiang et al.
- Can Large Language Models Transform Computational Social Science? Caleb Ziems et al.
- Large Language Models In The Workplace: A Case Study On Prompt Engineering For Job Type Classification Benjamin Clavié, Alexandru Ciceu, Frederick Naylor, Guillaume Soulié, Thomas Brightwell
- Friend Or Foe? Exploring The Implications Of Large Language Models On The Science System Benedikt Fecher, Marcel Hebing, Melissa Laufer, Jörg Pohle, Fabian Sofsky
- Check Your Facts And Try Again: Improving Large Language Models With External Knowledge And Automated Feedback Baolin Peng et al.
- Expertprompting: Instructing Large Language Models To Be Distinguished Experts Benfeng Xu et al.
- Refactoring Programs Using Large Language Models With Few-shot Examples Atsushi Shirafuji, Yusuke Oda, Jun Suzuki, Makoto Morishita, Yutaka Watanobe
- Exploring The Responses Of Large Language Models To Beginner Programmers' Help Requests Arto Hellas et al.
- Orca 2: Teaching Small Language Models How To Reason Arindam Mitra et al.
- Better Zero-shot Reasoning With Role-play Prompting Aobo Kong et al.
- Universal And Transferable Adversarial Attacks On Aligned Language Models Andy Zou et al.
- On The Application Of Large Language Models For Language Teaching And Assessment Technology Andrew Caines et al.
- Generative AI: Implications And Applications For Education Anastasia Olnancy Olga et al.
- How Good Are GPT Models At Machine Translation? A Comprehensive Evaluation Amr Hendy et al.
- Robots That Ask For Help: Uncertainty Alignment For Large Language Model Planners Allen Z. Ren et al.
- Model Tuning Or Prompt Tuning? A Study Of Large Language Models For Clinical Concept And Relation Extraction Cheng Peng et al.
- Jailbroken: How Does LLM Safety Training Fail? Alexander Wei, Nika Haghtalab, Jacob Steinhardt
- Smoothllm: Defending Large Language Models Against Jailbreaking Attacks Alexander Robey, Eric Wong, Hamed Hassani, George J. Pappas
- What Does CLIP Know About A Red Circle? Visual Prompt Engineering For Vlms Aleksandar Shtedritski, Christian Rupprecht, Andrea Vedaldi
- Can Chatgpt And Bard Generate Aligned Assessment Items? A Reliability Analysis Against Human Performance Abdolvahab Khademi
- Conversational Ai-powered Design: Chatgpt As Designer, User, And Product A. Baki Kocaballi
- Toolllm: Facilitating Large Language Models To Master 16000+ Real-world Apis Yujia Qin et al.
- MM-REACT: Prompting Chatgpt For Multimodal Reasoning And Action Zhengyuan Yang et al.
- Adaptive Machine Translation With Large Language Models Yasmin Moslem, Rejwanul Haque, John D. Kelleher, Andy Way
- Prompting Large Language Models With Speech Recognition Abilities Yassir Fathullah et al.
- Translating Natural Language To Planning Goals With Large-language Models Yaqi Xie et al.
- RTLLM: An Open-source Benchmark For Design RTL Generation With Large Language Model Yao Lu, Shang Liu, Qijun Zhang, Zhiyao Xie
- Collaborative Large Language Model For Recommender Systems Yaochen Zhu, Liang Wu, Qi Guo, Liangjie Hong, Jundong Li
- 3D-LLM: Injecting The 3D World Into Large Language Models Yining Hong et al.
- Recommender Systems In The Era Of Large Language Models (llms) Zihuai Zhao et al.
- Improving Factuality And Reasoning In Language Models Through Multiagent Debate Yilun Du, Shuang Li, Antonio Torralba, Joshua B. Tenenbaum, Igor Mordatch
- Graph Neural Prompting With Large Language Models Yijun Tian et al.
- Human-centric Autonomous Systems With Llms For User Command Reasoning Yi Yang et al.
- Making Large Language Models Perform Better In Knowledge Graph Completion Yichi Zhang et al.
- Jailbreaking Chatgpt Via Prompt Engineering: An Empirical Study Yi Liu et al.
- Llm-eval: Unified Multi-dimensional Automatic Evaluation For Open-domain Conversations With Large Language Models Yen-ting Lin, Yun-nung Chen
- A Multitask, Multilingual, Multimodal Evaluation Of Chatgpt On Reasoning, Hallucination, And Interactivity Yejin Bang et al.
- Mindmap: Knowledge Graph Prompting Sparks Graph Of Thoughts In Large Language Models Yilin Wen, Zifeng Wang, Jimeng Sun
- Gpt4aigchip: Towards Next-generation AI Accelerator Design Automation Via Large Language Models Yonggan Fu et al.
- Assessing Cross-cultural Alignment Between Chatgpt And Human Societies: An Empirical Study Yong Cao et al.
- Autotamp: Autoregressive Task And Motion Planning With Llms As Translators And Checkers Yongchao Chen et al.
- When Prompt-based Incremental Learning Does Not Meet Strong Pretraining Yu-ming Tang, Yi-xing Peng, Wei-shi Zheng
- Fundamental Limitations Of Alignment In Large Language Models Yotam Wolf, Noam Wies, Oshri Avnery, Yoav Levine, Amnon Shashua
- Llavar: Enhanced Visual Instruction Tuning For Text-rich Image Understanding Yanzhe Zhang et al.
- March In Chat: Interactive Prompting For Remote Embodied Referring Expression Yanyuan Qiao, Yuankai Qi, Zheng Yu, Jing Liu, Qi Wu
- Alpacafarm: A Simulation Framework For Methods That Learn From Human Feedback Yann Dubois et al.
- Specializing Smaller Language Models Towards Multi-step Reasoning Yao Fu, Hao Peng, Litu Ou, Ashish Sabharwal, Tushar Khot
- Improving Large Language Models For Clinical Named Entity Recognition Via Prompt Engineering Yan Hu et al.
- Mental-llm: Leveraging Large Language Models For Mental Health Prediction Via Online Text Data Xuhai Xu et al.
- Speak Foreign Languages With Your Own Voice: Cross-lingual Neural Codec Language Modeling Ziqiang Zhang et al.
- "do Anything Now": Characterizing And Evaluating In-the-wild Jailbreak Prompts On Large Language Models Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, Yang Zhang
- How Robust Is GPT-3.5 To Predecessors? A Comprehensive Study On Language Understanding Tasks Xuanting Chen et al.
- Query Rewriting For Retrieval-augmented Large Language Models Xinbei Ma, Yeyun Gong, Pengcheng He, Hai Zhao, Nan Duan
- How To Unleash The Power Of Large Language Models For Few-shot Relation Extraction? Xin Xu, Yuqi Zhu, Xiaohan Wang, Ningyu Zhang
- Delving Into Multimodal Prompting For Fine-grained Visual Classification Xin Jiang et al.
- Don't Trust Chatgpt When Your Question Is Not In English: A Study Of Multilingual Abilities And Types Of Llms Xiang Zhang, Senyu Li, Bradley Hauer, Ning Shi, Grzegorz Kondrak
- Representation Learning With Large Language Models For Recommendation Xubin Ren et al.
- Chacha: Leveraging Large Language Models To Prompt Children To Share Their Emotions About Personal Events Woosuk Seo, Chanmo Yang, Young-ho Kim
- Language Models Represent Space And Time Wes Gurnee, Max Tegmark
- Is Chatgpt A Good Translator? Yes With GPT-4 As The Engine Wenxiang Jiao et al.
- Guiding Pretraining In Reinforcement Learning With Large Language Models Yuqing Du et al.
- Large Language Models Are Zero-shot Rankers For Recommender Systems Yupeng Hou et al.
- Towards Open-world Recommendation With Knowledge Augmentation From Large Language Models Yunjia Xi et al.
- Chat-rec: Towards Interactive And Explainable Llms-augmented Recommender System Yunfan Gao et al.
- Character-llm: A Trainable Agent For Role-playing Yunfan Shao, Linyang Li, Junqi Dai, Xipeng Qiu
- On Evaluating Adversarial Robustness Of Large Vision-language Models Yunqing Zhao et al.
- Educhat: A Large-scale Language Model-based Chatbot System For Intelligent Education Yuhao Dan et al.
- Large Language Model As Attributed Training Data Generator: A Tale Of Diversity And Bias Yue Yu et al.
- Guiding Large Language Models Via Directional Stimulus Prompting Zekun Li et al.
- Is Chatgpt A Good Sentiment Analyzer? A Preliminary Study Zengzhi Wang et al.
- Let The Llms Talk: Simulating Human-to-human Conversational QA Via Zero-shot Llm-to-llm Interactions Zahra Abbasiantaeb, Yifei Yuan, Evangelos Kanoulas, Mohammad Aliannejadi
- Hard Prompts Made Easy: Gradient-based Discrete Optimization For Prompt Tuning And Discovery Yuxin Wen et al.
- Assessing AI Detectors In Identifying Ai-generated Code: Implications For Education Wei Hung Pan et al.
- Contextual AI Journaling: Integrating LLM And Time Series Behavioral Sensing Technology To Promote Self-reflection And Well-being Using The Mindscape App Subigya Nepal et al.
- Who Validates The Validators? Aligning Llm-assisted Evaluation Of LLM Outputs With Human Preferences Shreya Shankar, J. D. Zamfirescu-pereira, Björn Hartmann, Aditya G. Parameswaran, Ian Arawjo
- An Empirical Study On Usage And Perceptions Of Llms In A Software Engineering Project Sanka Rasnayaka, Guanlin Wang, Ridwan Shariffdeen, Ganesh Neelakanta Iyer
- A Comprehensive Survey Of Hallucination Mitigation Techniques In Large Language Models S. M Towhidul Islam Tonmoy et al.
- A Systematic Survey Of Prompt Engineering In Large Language Models: Techniques And Applications Pranab Sahoo et al.
- Large Language Model Capabilities In Perioperative Risk Prediction And Prognostication Philip Chung et al.
- Shaping Human-ai Collaboration: Varied Scaffolding Levels In Co-writing With Language Models Paramveer S. Dhillon et al.
- Iris: An Ai-driven Virtual Tutor For Computer Science Education Patrick Bassner, Eduard Frankford, Stephan Krusche
- CBR-RAG: Case-based Reasoning For Retrieval Augmented Generation In Llms For Legal Question Answering Nirmalie Wiratunga et al.
- Fine-tuned Language Models Generate Stable Inorganic Materials As Text Nate Gruver et al.
- The Effect Of Sampling Temperature On Problem Solving In Large Language Models Matthew Renze, Erhan Guven
- A Piece Of Theatre: Investigating How Teachers Design LLM Chatbots To Assist Adolescent Cyberbullying Education Michael A. Hedderich et al.
- Supporting Sensemaking Of Large Language Model Outputs At Scale Katy Ilonka Gero, Chelse Swoopes, Ziwei Gu, Jonathan K. Kummerfeld, Elena L. Glassman
- Data Is All You Need: Finetuning Llms For Chip Design Via An Automated Design-data Augmentation Framework Kaiyan Chang et al.
- Pixart-σ: Weak-to-strong Training Of Diffusion Transformer For 4K Text-to-image Generation Junsong Chen et al.
- Openmedlm: Prompt Engineering Can Out-perform Fine-tuning In Medical Question-answering With Open-source Large Language Models Jenish Maharjan et al.
- Feedback-generation For Programming Exercises With GPT-4 Imen Azaiz, Natalie Kiesler, Sven Strickroth
- Revolutionizing Finance With Llms: An Overview Of Applications And Insights Huaqin Zhao et al.
- Benchmarking Retrieval-augmented Generation For Medicine Guangzhi Xiong, Qiao Jin, Zhiyong Lu, Aidong Zhang
- Code-aware Prompting: A Study Of Coverage Guided Test Generation In Regression Setting Using LLM Gabriel Ryan et al.
- The Power Of Noise: Redefining Retrieval For RAG Systems Florin Cuconasu et al.
- Embedding Large Language Models Into Extended Reality: Opportunities And Challenges For Inclusion, Engagement, And Privacy Efe Bozkir et al.
- MM1: Methods, Analysis & Insights From Multimodal LLM Pre-training Brandon Mckinzie et al.
- Gemini Goes To Med School: Exploring The Capabilities Of Multimodal Large Language Models On Medical Challenge Problems & Hallucinations Ankit Pal, Malaikannan Sankarasubbu
- RAG Vs Fine-tuning: Pipelines, Tradeoffs, And A Case Study On Agriculture Angels Balaguer et al.
- Why And When Llm-based Assistants Can Go Wrong: Investigating The Effectiveness Of Prompt-based Interactions For Software Help-seeking Anjali Khurana, Hari Subramonyam, Parmit K Chilana
- Harnessing Large Language Models For Text-rich Sequential Recommendation Zhi Zheng, Wenshuo Chao, Zhaopeng Qiu, Hengshu Zhu, Hui Xiong
- Trustllm: Trustworthiness In Large Language Models Yue Huang et al.
- How Johnny Can Persuade Llms To Jailbreak Them: Rethinking Persuasion To Challenge AI Safety By Humanizing Llms Yi Zeng et al.
- Unist: A Prompt-empowered Universal Model For Urban Spatio-temporal Prediction Yuan Yuan, Jingtao Ding, Jie Feng, Depeng Jin, Yong Li
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
- Farsight: Fostering Responsible AI Awareness During AI Application Prototyping Zijie J. Wang, Chinmay Kulkarni, Lauren Wilcox, Michael Terry, Michael Madaio
- Prompting Large Language Models With Rationale Heuristics For Knowledge-based Visual Question Answering Zhongjian Hu, Peng Yang, Bing Li, Fengyuan Liu
- Measurement Of Llm's Philosophies Of Human Nature Minheng Ni et al.
🏷 Pruning
- Sequence-level Knowledge Distillation Yoon Kim, Alexander M. Rush
- Efficient Contextualized Representation: Language Model Pruning For Sequence Labeling Liyuan Liu, Xiang Ren, Jingbo Shang, Jian Peng, Jiawei Han
- Reducing Transformer Depth On Demand With Structured Dropout Angela Fan, Edouard Grave, Armand Joulin
- Reweighted Proximal Pruning For Large-scale Language Representation Fu-ming Guo, Sijia Liu, Finlay S. Mungall, Xue Lin, Yanzhi Wang
- Are Sixteen Heads Really Better Than One? Paul Michel, Omer Levy, Graham Neubig
- Structured Pruning Of A Bert-based Question Answering Model J. S. Mccarley, Rishav Chakravarti, Avirup Sil
- Analyzing Multi-head Self-attention: Specialized Heads Do The Heavy Lifting, The Rest Can Be Pruned Elena Voita, David Talbot, Fedor Moiseev, Rico Sennrich, Ivan Titov
- Structured Pruning Of Large Language Models Ziheng Wang, Jeremy Wohlwend, Tao Lei
- Train Large, Then Compress: Rethinking Model Size For Efficient Training And Inference Of Transformers Zhuohan Li et al.
- When BERT Plays The Lottery, All Tickets Are Winning Sai Prasanna, Anna Rogers, Anna Rumshisky
- Lite Transformer With Long-short Range Attention Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han
- Efficient Transformer-based Large Scale Language Representations Using Hardware-friendly Block Structured Pruning Bingbing Li et al.
- Edgebert: Sentence-level Energy Optimizations For Latency-aware Multi-task NLP Inference Thierry Tambe et al.
- On The Effect Of Dropping Layers Of Pre-trained Transformer Models Hassan Sajjad, Fahim Dalvi, Nadir Durrani, Preslav Nakov
- The NLP Cookbook: Modern Recipes For Transformer Based Deep Learning Architectures Sushant Singh, Ausif Mahmood
- Greedy-layer Pruning: Speeding Up Transformer Models For Natural Language Processing David Peer, Sebastian Stabinger, Stefan Engl, Antonio Rodriguez-sanchez
- Prune Once For All: Sparse Pre-trained Language Models Ofir Zafrir, Ariel Larey, Guy Boudoukh, Haihao Shen, Moshe Wasserblat
- Scalable And Efficient Moe Training For Multitask Multilingual Models Young Jin Kim et al.
- Learned Token Pruning For Transformers Sehoon Kim et al.
- Interactive Code Generation Via Test-driven User-intent Formalization Shuvendu K. Lahiri et al.
- Accelerating Attention Through Gradient-based Learned Runtime Pruning Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, Hadi Esmaeilzadeh, Mingu Kang
- Structured Pruning Learns Compact And Accurate Models Mengzhou Xia, Zexuan Zhong, Danqi Chen
- The Optimal BERT Surgeon: Scalable And Accurate Second-order Pruning For Large Language Models Eldar Kurtic et al.
- M6-rec: Generative Pretrained Language Models Are Open-ended Recommender Systems Zeyu Cui, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang
- Automatic Prompt Augmentation And Selection With Chain-of-thought From Labeled Data Kashun Shum, Shizhe Diao, Tong Zhang
- Biomedical Knowledge Graph-optimized Prompt Generation For Large Language Models Karthik Soman et al.
- A Simple And Effective Pruning Approach For Large Language Models Mingjie Sun, Zhuang Liu, Anna Bair, J. Zico Kolter
- Llmrec: Large Language Models With Graph Augmentation For Recommendation Wei Wei et al.
- Sparsegpt: Massive Language Models Can Be Accurately Pruned In One-shot Elias Frantar, Dan Alistarh
- A Survey On Model Compression For Large Language Models Xunyu Zhu, Jian Li, Yong Liu, Can Ma, Weiping Wang
- Llm-pruner: On The Structural Pruning Of Large Language Models Xinyin Ma, Gongfan Fang, Xinchao Wang
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- Data-efficient Fine-tuning For Llm-based Recommendation Xinyu Lin et al.
🏷 Quantization
- Fully Quantized Transformer For Machine Translation Gabriele Prato, Ella Charlaix, Mehdi Rezagholizadeh
- Convert: Efficient And Accurate Conversational Representations From Transformers Matthew Henderson et al.
- Q8BERT: Quantized 8bit BERT Ofir Zafrir, Guy Boudoukh, Peter Izsak, Moshe Wasserblat
- Well-read Students Learn Better: On The Importance Of Pre-training Compact Models Iulia Turc, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Structured Pruning Of A Bert-based Question Answering Model J. S. Mccarley, Rishav Chakravarti, Avirup Sil
- Structured Pruning Of Large Language Models Ziheng Wang, Jeremy Wohlwend, Tao Lei
- Model Compression With Two-stage Multi-teacher Knowledge Distillation For Web Question Answering System Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang
- Bert-of-theseus: Compressing BERT By Progressive Module Replacing Canwen Xu, Wangchunshu Zhou, Tao Ge, Furu Wei, Ming Zhou
- Train Large, Then Compress: Rethinking Model Size For Efficient Training And Inference Of Transformers Zhuohan Li et al.
- Lite Transformer With Long-short Range Attention Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han
- Ternarybert: Distillation-aware Ultra-low Bit BERT Wei Zhang et al.
- Contrastive Distillation On Intermediate Representations For Language Model Compression Siqi Sun et al.
- Edgebert: Sentence-level Energy Optimizations For Latency-aware Multi-task NLP Inference Thierry Tambe et al.
- GOBO: Quantizing Attention-based NLP Models For Low Latency And Energy Efficient Inference Ali Hadi Zadeh, Isak Edo, Omar Mohamed Awad, Andreas Moshovos
- The NLP Cookbook: Modern Recipes For Transformer Based Deep Learning Architectures Sushant Singh, Ausif Mahmood
- Ernie-vilg: Unified Generative Pre-training For Bidirectional Vision-language Generation Han Zhang et al.
- One Teacher Is Enough? Pre-trained Language Model Distillation From Multiple Teachers Chuhan Wu, Fangzhao Wu, Yongfeng Huang
- Understanding And Overcoming The Challenges Of Efficient Transformer Quantization Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort
- Prune Once For All: Sparse Pre-trained Language Models Ofir Zafrir, Ariel Larey, Guy Boudoukh, Haihao Shen, Moshe Wasserblat
- I-BERT: Integer-only BERT Quantization Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer
- Smoothquant: Accurate And Efficient Post-training Quantization For Large Language Models Guangxuan Xiao et al.
- LUT-GEMM: Quantized Matrix Multiplication Based On Luts For Efficient Inference In Large-scale Generative Language Models Gunho Park et al.
- Zeroquant: Efficient And Affordable Post-training Quantization For Large-scale Transformers Zhewei Yao et al.
- Structured Pruning Learns Compact And Accurate Models Mengzhou Xia, Zexuan Zhong, Danqi Chen
- GPTQ: Accurate Post-training Quantization For Generative Pre-trained Transformers Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh
- The Optimal BERT Surgeon: Scalable And Accurate Second-order Pruning For Large Language Models Eldar Kurtic et al.
- A Survey On Model Compression And Acceleration For Pretrained Language Models Canwen Xu, Julian Mcauley
- GLM-130B: An Open Bilingual Pre-trained Model Aohan Zeng et al.
- Language Model Compression With Weighted Low-rank Factorization Yen-chang Hsu et al.
- Llm.int8(): 8-bit Matrix Multiplication For Transformers At Scale Tim Dettmers, Mike Lewis, Younes Belkada, Luke Zettlemoyer
- AWQ: Activation-aware Weight Quantization For LLM Compression And Acceleration Ji Lin et al.
- Memory-efficient Fine-tuning Of Compressed Large Language Models Via Sub-4-bit Integer Quantization Jeonghoon Kim et al.
- Pushing Large Language Models To The 6G Edge: Vision, Challenges, And Opportunities Zheng Lin et al.
- Qlora: Efficient Finetuning Of Quantized Llms Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, Luke Zettlemoyer
- Spqr: A Sparse-quantized Representation For Near-lossless LLM Weight Compression Tim Dettmers et al.
- Sparsegpt: Massive Language Models Can Be Accurately Pruned In One-shot Elias Frantar, Dan Alistarh
- Adapting Large Language Models By Integrating Collaborative Semantics For Recommendation Bowen Zheng et al.
- Motiongpt: Human Motion As A Foreign Language Biao Jiang et al.
- A Survey On Model Compression For Large Language Models Xunyu Zhu, Jian Li, Yong Liu, Can Ma, Weiping Wang
- Billm: Pushing The Limit Of Post-training Quantization For Llms Wei Huang et al.
- Understanding Llms: A Comprehensive Overview From Training To Inference Yiheng Liu et al.
- Biomistral: A Collection Of Open-source Pretrained Large Language Models For Medical Domains Yanis Labrak et al.
🏷 RAG
- Programming With A Differentiable Forth Interpreter Matko Bošnjak, Tim Rocktäschel, Jason Naradowsky, Sebastian Riedel
- Topic Aware Neural Response Generation Chen Xing et al.
- An Actor-critic Algorithm For Sequence Prediction Dzmitry Bahdanau et al.
- A User Simulator For Task-completion Dialogues Xiujun Li et al.
- Separating Answers From Queries For Neural Reading Comprehension Dirk Weissenborn
- Generative Deep Neural Networks For Dialogue: A Short Review Iulian Vlad Serban, Ryan Lowe, Laurent Charlin, Joelle Pineau
- The LAMBADA Dataset: Word Prediction Requiring A Broad Discourse Context Denis Paperno et al.
- Triviaqa: A Large Scale Distantly Supervised Challenge Dataset For Reading Comprehension Mandar Joshi, Eunsol Choi, Daniel S. Weld, Luke Zettlemoyer
- Mojitalk: Generating Emotional Responses At Scale Xianda Zhou, William Yang Wang
- Constructing Datasets For Multi-hop Reading Comprehension Across Documents Johannes Welbl, Pontus Stenetorp, Sebastian Riedel
- Searchqa: A New Q&A Dataset Augmented With Context From A Search Engine Matthew Dunn et al.
- End-to-end Optimization Of Goal-driven And Visually Grounded Dialogue Systems Florian Strub et al.
- A Unified Query-based Generative Model For Question Generation And Question Answering Linfeng Song, Zhiguo Wang, Wael Hamza
- DP-GAN: Diversity-promoting Generative Adversarial Network For Generating Informative And Diversified Text Jingjing Xu, Xuancheng Ren, Junyang Lin, Xu Sun
- Fast Abstractive Summarization With Reinforce-selected Sentence Rewriting Yen-chun Chen, Mohit Bansal
- Efficient Contextualized Representation: Language Model Pruning For Sequence Labeling Liyuan Liu, Xiang Ren, Jingbo Shang, Jian Peng, Jiawei Han
- Improving Machine Reading Comprehension With General Reading Strategies Kai Sun, Dian Yu, Dong Yu, Claire Cardie
- Extending Neural Generative Conversational Model Using External Knowledge Sources Prasanna Parthasarathi, Joelle Pineau
- Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context Urvashi Khandelwal, He He, Peng Qi, Dan Jurafsky
- Advancing The State Of The Art In Open Domain Dialog Systems Through The Alexa Prize Chandra Khatri et al.
- Language Gans Falling Short Massimo Caccia et al.
- Robust Text-to-sql Generation With Execution-guided Decoding Chenglong Wang et al.
- Sdnet: Contextualized Attention-based Deep Network For Conversational Question Answering Chenguang Zhu, Michael Zeng, Xuedong Huang
- An Affect-rich Neural Conversational Model With Biased Attention And Weighted Cross-entropy Loss Peixiang Zhong, Di Wang, Chunyan Miao
- Toward Diverse Text Generation With Inverse Reinforcement Learning Zhan Shi, Xinchi Chen, Xipeng Qiu, Xuanjing Huang
- A Dataset For Document Grounded Conversations Kangyan Zhou, Shrimai Prabhumoye, Alan W Black
- Hybrid Retrieval-generation Reinforced Agent For Medical Image Report Generation Christy Y. Li, Xiaodan Liang, Zhiting Hu, Eric P. Xing
- Retrieval-enhanced Adversarial Training For Neural Response Generation Qingfu Zhu, Lei Cui, Weinan Zhang, Furu Wei, Ting Liu
- Training Tips For The Transformer Model Martin Popel, Ondřej Bojar
- Training Millions Of Personalized Dialogue Agents Pierre-emmanuel Mazaré, Samuel Humeau, Martin Raison, Antoine Bordes
- Emrqa: A Large Corpus For Question Answering On Electronic Medical Records Anusri Pampari, Preethi Raghavan, Jennifer Liang, Jian Peng
- Complex Sequential Question Answering: Towards Learning To Converse Over Linked Question Answer Pairs With A Knowledge Graph Amrita Saha, Vardaan Pahuja, Mitesh M. Khapra, Karthik Sankaranarayanan, Sarath Chandar
- Generating Informative And Diverse Conversational Responses Via Adversarial Information Maximization Yizhe Zhang et al.
- Another Diversity-promoting Objective Function For Neural Dialogue Generation Ryo Nakamura, Katsuhito Sudoh, Koichiro Yoshino, Satoshi Nakamura
- Attention-guided Answer Distillation For Machine Reading Comprehension Minghao Hu et al.
- Topic-based Evaluation For Conversational Bots Fenfei Guo et al.
- Ranking Paragraphs For Improving Answer Recall In Open-domain Question Answering Jinhyuk Lee, Seongjun Yun, Hyunjae Kim, Miyoung Ko, Jaewoo Kang
- Towards Empathetic Open-domain Conversation Models: A New Benchmark And Dataset Hannah Rashkin, Eric Michael Smith, Margaret Li, Y-lan Boureau
- Simple Fusion: Return Of The Language Model Felix Stahlberg, James Cross, Veselin Stoyanov
- Polite Dialogue Generation Without Parallel Data Tong Niu, Mohit Bansal
- Multi-passage BERT: A Globally Normalized BERT Model For Open-domain Question Answering Zhiguo Wang, Patrick Ng, Xiaofei Ma, Ramesh Nallapati, Bing Xiang
- Cross-lingual Language Model Pretraining Guillaume Lample, Alexis Conneau
- Scalable Attentive Sentence-pair Modeling Via Distilled Sentence Embedding Oren Barkan et al.
- Reqa: An Evaluation For End-to-end Answer Retrieval Models Amin Ahmad, Noah Constant, Yinfei Yang, Daniel Cer
- Structbert: Incorporating Language Structures Into Pre-training For Deep Language Understanding Wei Wang et al.
- Probing Natural Language Inference Models Through Semantic Fragments Kyle Richardson, Hai Hu, Lawrence S. Moss, Ashish Sabharwal
- Transformer-xl: Attentive Language Models Beyond A Fixed-length Context Zihang Dai et al.
- Multiqa: An Empirical Investigation Of Generalization And Transfer In Reading Comprehension Alon Talmor, Jonathan Berant
- Understanding The Behaviors Of BERT In Ranking Yifan Qiao, Chenyan Xiong, Zhenghao Liu, Zhiyuan Liu
- Convert: Efficient And Accurate Conversational Representations From Transformers Matthew Henderson et al.
- Contextualized Sparse Representations For Real-time Open-domain Question Answering Jinhyuk Lee, Minjoon Seo, Hannaneh Hajishirzi, Jaewoo Kang
- Answering Complex Open-domain Questions Through Iterative Query Generation Peng Qi, Xiaowen Lin, Leo Mehr, Zijian Wang, Christopher D. Manning
- Mixture Content Selection For Diverse Sequence Generation Jaemin Cho, Minjoon Seo, Hannaneh Hajishirzi
- Cross-lingual Natural Language Generation Via Pre-training Zewen Chi et al.
- Good-enough Compositional Data Augmentation Jacob Andreas
- Neural Assistant: Joint Action Prediction, Response Generation, And Latent Knowledge Reasoning Arvind Neelakantan et al.
- PLATO: Pre-trained Dialogue Generation Model With Discrete Latent Variable Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang
- Transformers Without Tears: Improving The Normalization Of Self-attention Toan Q. Nguyen, Julian Salazar
- Distilbert, A Distilled Version Of BERT: Smaller, Faster, Cheaper And Lighter Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
- Unicoder: A Universal Language Encoder By Pre-training With Multiple Cross-lingual Tasks Haoyang Huang et al.
- Well-read Students Learn Better: On The Importance Of Pre-training Compact Models Iulia Turc, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Multi-step Retriever-reader Interaction For Scalable Open-domain Question Answering Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Andrew Mccallum
- Pythia: Ai-assisted Code Completion System Alexey Svyatkovskiy, Ying Zhao, Shengyu Fu, Neel Sundaresan
- Unsupervised Cross-lingual Representation Learning At Scale Alexis Conneau et al.
- Are Sixteen Heads Really Better Than One? Paul Michel, Omer Levy, Graham Neubig
- Tinybert: Distilling BERT For Natural Language Understanding Xiaoqi Jiao et al.
- Learning To Retrieve Reasoning Paths Over Wikipedia Graph For Question Answering Akari Asai, Kazuma Hashimoto, Hannaneh Hajishirzi, Richard Socher, Caiming Xiong
- Sentence-level Content Planning And Style Specification For Neural Text Generation Xinyu Hua, Lu Wang
- Review Conversational Reading Comprehension Hu Xu, Bing Liu, Lei Shu, Philip S. Yu
- Incorporating External Knowledge Into Machine Reading For Generative Question Answering Bin Bi et al.
- Modeling Recurrence For Transformer Jie Hao et al.
- Rankqa: Neural Question Answering With Answer Re-ranking Bernhard Kratzwald, Anna Eigenmann, Stefan Feuerriegel
- End-to-end Bias Mitigation By Modelling Biases In Corpora Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- Codegru: Context-aware Deep Learning With Gated Recurrent Unit For Source Code Modeling Yasir Hussain, Zhiqiu Huang, Yu Zhou, Senzhang Wang
- Learning From Explanations With Neural Execution Tree Ziqi Wang et al.
- MASS: Masked Sequence To Sequence Pre-training For Language Generation Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-yan Liu
- Reinforced Dynamic Reasoning For Conversational Question Generation Boyuan Pan, Hao Li, Ziyu Yao, Deng Cai, Huan Sun
- A Simple But Effective Method To Incorporate Multi-turn Context With BERT For Conversational Machine Comprehension Yasuhito Ohsugi, Itsumi Saito, Kyosuke Nishida, Hisako Asano, Junji Tomita
- Enabling Robots To Understand Incomplete Natural Language Instructions Using Commonsense Reasoning Haonan Chen, Hao Tan, Alan Kuntz, Mohit Bansal, Ron Alterovitz
- Personalizing Dialogue Agents Via Meta-learning Zhaojiang Lin, Andrea Madotto, Chien-sheng Wu, Pascale Fung
- Leveraging Pre-trained Checkpoints For Sequence Generation Tasks Sascha Rothe, Shashi Narayan, Aliaksei Severyn
- Distilling Knowledge Learned In BERT For Text Generation Yen-chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu
- Paraphrasing With Large Language Models Sam Witteveen, Martin Andrews
- Patent Claim Generation By Fine-tuning Openai GPT-2 Jieh-sheng Lee, Jieh Hsiang
- Knowledge Aware Conversation Generation With Explainable Reasoning Over Augmented Graphs Zhibin Liu, Zheng-yu Niu, Hua Wu, Haifeng Wang
- Retrieve, Read, Rerank: Towards End-to-end Multi-document Reading Comprehension Minghao Hu, Yuxing Peng, Zhen Huang, Dongsheng Li
- Learning And Evaluating General Linguistic Intelligence Dani Yogatama et al.
- Dialogpt: Large-scale Generative Pre-training For Conversational Response Generation Yizhe Zhang et al.
- 12-in-1: Multi-task Vision And Language Representation Learning Jiasen Lu, Vedanuj Goswami, Marcus Rohrbach, Devi Parikh, Stefan Lee
- Improving Neural Response Diversity With Frequency-aware Cross-entropy Loss Shaojie Jiang, Pengjie Ren, Christof Monz, Maarten De Rijke
- Inducing Brain-relevant Bias In Natural Language Processing Models Dan Schwartz, Mariya Toneva, Leila Wehbe
- Tree Transformer: Integrating Tree Structures Into Self-attention Yau-shian Wang, Hung-yi Lee, Yun-nung Chen
- Attentive History Selection For Conversational Question Answering Chen Qu et al.
- Unsupervised Question Answering By Cloze Translation Patrick Lewis, Ludovic Denoyer, Sebastian Riedel
- Linguistic Knowledge And Transferability Of Contextual Representations Nelson F. Liu, Matt Gardner, Yonatan Belinkov, Matthew E. Peters, Noah A. Smith
- MLQA: Evaluating Cross-lingual Extractive Question Answering Patrick Lewis, Barlas Oğuz, Ruty Rinott, Sebastian Riedel, Holger Schwenk
- Jointly Optimizing Diversity And Relevance In Neural Response Generation Xiang Gao et al.
- Blockwise Self-attention For Long Document Understanding Jiezhong Qiu et al.
- Cosmos QA: Machine Reading Comprehension With Contextual Commonsense Reasoning Lifu Huang, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi
- Learning To Select Knowledge For Response Generation In Dialog Systems Rongzhong Lian, Min Xie, Fan Wang, Jinhua Peng, Hua Wu
- Using Natural Language For Reward Shaping In Reinforcement Learning Prasoon Goyal, Scott Niekum, Raymond J. Mooney
- Gmail Smart Compose: Real-time Assisted Writing Mia Xu Chen et al.
- Real-time Open-domain Question Answering With Dense-sparse Phrase Index Minjoon Seo et al.
- Explain Yourself! Leveraging Language Models For Commonsense Reasoning Nazneen Fatema Rajani, Bryan Mccann, Caiming Xiong, Richard Socher
- Lakhnes: Improving Multi-instrumental Music Generation With Cross-domain Pre-training Chris Donahue, Huanru Henry Mao, Yiting Ethan Li, Garrison W. Cottrell, Julian Mcauley
- Context-aware Learning For Neural Machine Translation Sébastien Jean, Kyunghyun Cho
- Fusion Of Detected Objects In Text For Visual Question Answering Chris Alberti, Jeffrey Ling, Michael Collins, David Reitter
- Learning To Few-shot Learn Across Diverse Natural Language Classification Tasks Trapit Bansal, Rishikesh Jha, Andrew Mccallum
- Text Infilling Wanrong Zhu, Zhiting Hu, Eric Xing
- Syntax-infused Transformer And BERT Models For Machine Translation And Natural Language Understanding Dhanasekar Sundararaman et al.
- Towards Transfer Learning For End-to-end Speech Synthesis From Deep Pre-trained Language Models Wei Fang, Yu-an Chung, James Glass
- Pretrained Encyclopedia: Weakly Supervised Knowledge-pretrained Language Model Wenhan Xiong, Jingfei Du, William Yang Wang, Veselin Stoyanov
- Few-shot NLG With Pre-trained Language Model Zhiyu Chen, Harini Eavani, Wenhu Chen, Yinyin Liu, William Yang Wang
- Attention-informed Mixed-language Training For Zero-shot Cross-lingual Task-oriented Dialogue Systems Zihan Liu, Genta Indra Winata, Zhaojiang Lin, Peng Xu, Pascale Fung
- UNIMO: Towards Unified-modal Understanding And Generation Via Cross-modal Contrastive Learning Wei Li et al.
- Unsupervised Paraphrase Generation Using Pre-trained Language Models Chaitra Hegde, Shrikumar Patil
- Measuring Systematic Generalization In Neural Proof Generation With Transformers Nicolas Gontier, Koustuv Sinha, Siva Reddy, Christopher Pal
- Sequential Latent Knowledge Selection For Knowledge-grounded Dialogue Byeongchang Kim, Jaewoo Ahn, Gunhee Kim
- Like Hiking? You Probably Enjoy Nature: Persona-grounded Dialog With Commonsense Expansions Bodhisattwa Prasad Majumder, Harsh Jhamtani, Taylor Berg-kirkpatrick, Julian Mcauley
- REALM: Retrieval-augmented Language Model Pre-training Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-wei Chang
- Fine-tuning Pretrained Language Models: Weight Initializations, Data Orders, And Early Stopping Jesse Dodge et al.
- Towards A Human-like Open-domain Chatbot Daniel Adiwardana et al.
- Doc2dial: A Goal-oriented Document-grounded Dialogue Dataset Song Feng et al.
- Conversational Question Reformulation Via Sequence-to-sequence Architectures And Pretrained Language Models Sheng-chieh Lin et al.
- Intermediate-task Transfer Learning With Pretrained Models For Natural Language Understanding: When And Why Does It Work? Yada Pruksachatkun et al.
- WT5?! Training Text-to-text Models To Explain Their Predictions Sharan Narang et al.
- CG-BERT: Conditional Text Generation With BERT For Generalized Few-shot Intent Detection Congying Xia, Chenwei Zhang, Hoang Nguyen, Jiawei Zhang, Philip Yu
- Knowledge-aware Language Model Pretraining Corby Rosset et al.
- Making Pre-trained Language Models Better Few-shot Learners Tianyu Gao, Adam Fisch, Danqi Chen
- Exploring Fine-tuning Techniques For Pre-trained Cross-lingual Models Via Continual Learning Zihan Liu, Genta Indra Winata, Andrea Madotto, Pascale Fung
- Ternarybert: Distillation-aware Ultra-low Bit BERT Wei Zhang et al.
- Coda: Contrast-enhanced And Diversity-promoting Data Augmentation For Natural Language Understanding Yanru Qu et al.
- Gpt-too: A Language-model-first Approach For Amr-to-text Generation Manuel Mager et al.
- VD-BERT: A Unified Vision And Dialog Transformer With BERT Yue Wang et al.
- KG-BART: Knowledge Graph-augmented BART For Generative Commonsense Reasoning Ye Liu, Yao Wan, Lifang He, Hao Peng, Philip S. Yu
- Better Robustness By More Coverage: Adversarial Training With Mixup Augmentation For Robust Fine-tuning Chenglei Si et al.
- Knowledge-grounded Dialogue Generation With Pre-trained Language Models Xueliang Zhao et al.
- Explaining Question Answering Models Through Text Generation Veronica Latcinnik, Jonathan Berant
- Sequence-level Mixed Sample Data Augmentation Demi Guo, Yoon Kim, Alexander M. Rush
- Learning To Recombine And Resample Data For Compositional Generalization Ekin Akyürek, Afra Feyza Akyürek, Jacob Andreas
- Inducing Language-agnostic Multilingual Representations Wei Zhao, Steffen Eger, Johannes Bjerva, Isabelle Augenstein
- Logic2text: High-fidelity Natural Language Generation From Logical Forms Zhiyu Chen et al.
- M3P: Learning Universal Representations Via Multitask Multilingual Multimodal Pre-training Minheng Ni et al.
- A Simple Language Model For Task-oriented Dialogue Ehsan Hosseini-asl, Bryan Mccann, Chien-sheng Wu, Semih Yavuz, Richard Socher
- Efficient Transformer-based Large Scale Language Representations Using Hardware-friendly Block Structured Pruning Bingbing Li et al.
- PONE: A Novel Automatic Evaluation Metric For Open-domain Generative Dialogue Systems Tian Lan, Xian-ling Mao, Wei Wei, Xiaoyan Gao, Heyan Huang
- Neural Generation Meets Real People: Towards Emotionally Engaging Mixed-initiative Conversations Ashwin Paranjape et al.
- Chatbot Interaction With Artificial Intelligence: Human Data Augmentation With T5 And Language Transformer Ensemble For Text Classification Jordan J. Bird, Anikó Ekárt, Diego R. Faria
- Improving Vision-and-language Navigation With Image-text Pairs From The Web Arjun Majumdar et al.
- SPECTER: Document-level Representation Learning Using Citation-informed Transformers Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld
- MART: Memory-augmented Recurrent Transformer For Coherent Video Paragraph Captioning Jie Lei et al.
- Just Ask: Learning To Answer Questions From Millions Of Narrated Videos Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- Better Fine-tuning By Reducing Representational Collapse Armen Aghajanyan et al.
- Generative Data Augmentation For Commonsense Reasoning Yiben Yang et al.
- ABNIRML: Analyzing The Behavior Of Neural IR Models Sean Macavaney, Sergey Feldman, Nazli Goharian, Doug Downey, Arman Cohan
- Incorporating BERT Into Parallel Sequence Decoding With Adapters Junliang Guo et al.
- Adapterdrop: On The Efficiency Of Adapters In Transformers Andreas Rücklé et al.
- Natural Language Rationales With Full-stack Visual Reasoning: From Pixels To Semantic Frames To Commonsense Graphs Ana Marasović et al.
- Ernie-doc: A Retrospective Long-document Modeling Transformer Siyu Ding et al.
- Non-autoregressive Machine Translation With Disentangled Context Transformer Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu
- GOBO: Quantizing Attention-based NLP Models For Low Latency And Energy Efficient Inference Ali Hadi Zadeh, Isak Edo, Omar Mohamed Awad, Andreas Moshovos
- Intellicode Compose: Code Generation Using Transformer Alexey Svyatkovskiy, Shao Kun Deng, Shengyu Fu, Neel Sundaresan
- The Effect Of Natural Distribution Shift On Question Answering Models John Miller, Karl Krauth, Benjamin Recht, Ludwig Schmidt
- Prophetnet: Predicting Future N-gram For Sequence-to-sequence Pre-training Weizhen Qi et al.
- Logic-guided Data Augmentation And Regularization For Consistent Question Answering Akari Asai, Hannaneh Hajishirzi
- Grounding Language To Autonomously-acquired Skills Via Goal Generation Ahmed Akakzia, Cédric Colas, Pierre-yves Oudeyer, Mohamed Chetouani, Olivier Sigaud
- Mixkd: Towards Efficient Distillation Of Large-scale Language Models Kevin J Liang et al.
- Look Before You Speak: Visually Contextualized Utterances Paul Hongsuck Seo, Arsha Nagrani, Cordelia Schmid
- Retrieval-augmented Generation For Knowledge-intensive NLP Tasks Patrick Lewis et al.
- Asking Questions The Human Way: Scalable Question-answer Generation From Text Corpus Bang Liu, Haojie Wei, Di Niu, Haolan Chen, Yancheng He
- Dialoglue: A Natural Language Understanding Benchmark For Task-oriented Dialogue Shikib Mehri, Mihail Eric, Dilek Hakkani-tur
- XTREME: A Massively Multilingual Multi-task Benchmark For Evaluating Cross-lingual Generalization Junjie Hu et al.
- Simplifying Paragraph-level Question Generation Via Transformer Language Models Luis Enrico Lopez, Diane Kathryn Cruz, Jan Christian Blaise Cruz, Charibeth Cheng
- Assessing Phrasal Representation And Composition In Transformers Lang Yu, Allyson Ettinger
- Leveraging Passage Retrieval With Generative Models For Open Domain Question Answering Gautier Izacard, Edouard Grave
- Nearest Neighbor Machine Translation Urvashi Khandelwal, Angela Fan, Dan Jurafsky, Luke Zettlemoyer, Mike Lewis
- X-FACTR: Multilingual Factual Knowledge Retrieval From Pretrained Language Models Zhengbao Jiang, Antonios Anastasopoulos, Jun Araki, Haibo Ding, Graham Neubig
- Cosda-ml: Multi-lingual Code-switching Data Augmentation For Zero-shot Cross-lingual NLP Libo Qin, Minheng Ni, Yue Zhang, Wanxiang Che
- Probing Pretrained Language Models For Lexical Semantics Ivan Vulić, Edoardo Maria Ponti, Robert Litschko, Goran Glavaš, Anna Korhonen
- Trading Off Diversity And Quality In Natural Language Generation Hugh Zhang, Daniel Duckworth, Daphne Ippolito, Arvind Neelakantan
- How Fine Can Fine-tuning Be? Learning Efficient Language Models Evani Radiya-dixit, Xin Wang
- Robust Encodings: A Framework For Combating Adversarial Typos Erik Jones, Robin Jia, Aditi Raghunathan, Percy Liang
- Turngpt: A Transformer-based Language Model For Predicting Turn-taking In Spoken Dialog Erik Ekstedt, Gabriel Skantze
- Accelerating Training Of Transformer-based Language Models With Progressive Layer Dropping Minjia Zhang, Yuxiong He
- CERT: Contrastive Self-supervised Learning For Language Understanding Hongchao Fang, Sicheng Wang, Meng Zhou, Jiayuan Ding, Pengtao Xie
- Multilingual Translation With Extensible Multilingual Pretraining And Finetuning Yuqing Tang et al.
- KGPT: Knowledge-grounded Pre-training For Data-to-text Generation Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang
- Residual Energy-based Models For Text Generation Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'aurelio Ranzato
- Aragpt2: Pre-trained Transformer For Arabic Language Generation Wissam Antoun, Fady Baly, Hazem Hajj
- Openvidial: A Large-scale, Open-domain Dialogue Dataset With Visual Contexts Yuxian Meng et al.
- Constructing A Multi-hop QA Dataset For Comprehensive Evaluation Of Reasoning Steps Xanh Ho, Anh-khoa Duong Nguyen, Saku Sugawara, Akiko Aizawa
- DAVE: Deriving Automatically Verilog From English Hammond Pearce, Benjamin Tan, Ramesh Karri
- Indic-transformers: An Analysis Of Transformer Language Models For Indian Languages Kushal Jain, Adwait Deshpande, Kumar Shridhar, Felix Laumann, Ayushman Dash
- Multilingual Speech Translation With Efficient Finetuning Of Pretrained Models Xian Li et al.
- Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation Ruibo Liu et al.
- Delight: Deep And Light-weight Transformer Sachin Mehta, Marjan Ghazvininejad, Srinivasan Iyer, Luke Zettlemoyer, Hannaneh Hajishirzi
- Mt5: A Massively Multilingual Pre-trained Text-to-text Transformer Linting Xue et al.
- Will I Sound Like Me? Improving Persona Consistency In Dialogues Through Pragmatic Self-consciousness Hyunwoo Kim, Byeongchang Kim, Gunhee Kim
- As Good As New. How To Successfully Recycle English GPT-2 To Make Models For Other Languages Wietse De Vries, Malvina Nissim
- Charbert: Character-aware Pre-trained Language Model Wentao Ma et al.
- Rethinking Embedding Coupling In Pre-trained Language Models Hyung Won Chung, Thibault Févry, Henry Tsai, Melvin Johnson, Sebastian Ruder
- Retrofitting Structure-aware Transformer Language Model For End Tasks Hao Fei, Yafeng Ren, Donghong Ji
- Generate Natural Language Explanations For Recommendation Hanxiong Chen, Xu Chen, Shaoyun Shi, Yongfeng Zhang
- Increasing Faithfulness In Knowledge-grounded Dialogue With Controllable Features Hannah Rashkin, David Reitter, Gaurav Singh Tomar, Dipanjan Das
- Vlmo: Unified Vision-language Pre-training With Mixture-of-modality-experts Hangbo Bao et al.
- Codified Audio Language Modeling Learns Useful Representations For Music Information Retrieval Rodrigo Castellon, Chris Donahue, Percy Liang
- Evaluating The Robustness Of Retrieval Pipelines With Query Variation Generators Gustavo Penha, Arthur Câmara, Claudia Hauff
- Prefix-tuning: Optimizing Continuous Prompts For Generation Xiang Lisa Li, Percy Liang
- P-tuning V2: Prompt Tuning Can Be Comparable To Fine-tuning Universally Across Scales And Tasks Xiao Liu et al.
- Contrastive Learning For Many-to-many Multilingual Neural Machine Translation Xiao Pan, Mingxuan Wang, Liwei Wu, Lei Li
- Defending Against Backdoor Attacks In Natural Language Generation Xiaofei Sun et al.
- Improved Text Classification Via Contrastive Adversarial Training Lin Pan, Chung-wei Hang, Avirup Sil, Saloni Potdar
- Reframing Instructional Prompts To Gptk's Language Swaroop Mishra, Daniel Khashabi, Chitta Baral, Yejin Choi, Hannaneh Hajishirzi
- Crossing The Conversational Chasm: A Primer On Natural Language Processing For Multilingual Task-oriented Dialogue Systems Evgeniia Razumovskaia et al.
- True Few-shot Learning With Language Models Ethan Perez, Douwe Kiela, Kyunghyun Cho
- Text-free Prosody-aware Generative Spoken Language Modeling Eugene Kharitonov et al.
- FILIP: Fine-grained Interactive Language-image Pre-training Lewei Yao et al.
- Prompt Programming For Large Language Models: Beyond The Few-shot Paradigm Laria Reynolds, Kyle Mcdonell
- Unsupervised Corpus Aware Language Model Pre-training For Dense Passage Retrieval Luyu Gao, Jamie Callan
- Fewshotqa: A Simple Framework For Few-shot Learning Of Question Answering Tasks Using Pre-trained Text-to-text Models Rakesh Chada, Pradeep Natarajan
- Training Large-scale News Recommenders With Pretrained Language Models In The Loop Shitao Xiao, Zheng Liu, Yingxia Shao, Tao Di, Xing Xie
- Bitod: A Bilingual Multi-domain Dataset For Task-oriented Dialogue Modeling Zhaojiang Lin et al.
- Swinbert: End-to-end Transformers With Sparse Attention For Video Captioning Kevin Lin et al.
- KAT: A Knowledge Augmented Transformer For Vision-and-language Liangke Gui et al.
- One Chatbot Per Person: Creating Personalized Chatbots Based On Implicit User Profiles Zhengyi Ma, Zhicheng Dou, Yutao Zhu, Hanxun Zhong, Ji-rong Wen
- Conversational Question Answering Over Knowledge Graphs With Transformer And Graph Attention Networks Endri Kacupaj et al.
- A Recipe For Arbitrary Text Style Transfer With Large Language Models Emily Reif et al.
- Parallel Refinements For Lexically Constrained Text Generation With BART Xingwei He
- How Should Pre-trained Language Models Be Fine-tuned Towards Adversarial Robustness? Xinshuai Dong, Luu Anh Tuan, Min Lin, Shuicheng Yan, Hanwang Zhang
- End-to-end Training Of Multi-document Reader And Retriever For Open-domain Question Answering Devendra Singh Sachan, Siva Reddy, William Hamilton, Chris Dyer, Dani Yogatama
- Improving And Simplifying Pattern Exploiting Training Derek Tam, Rakesh R Menon, Mohit Bansal, Shashank Srivastava, Colin Raffel
- Hierarchical Learning For Generation With Long Source Sequences Tobias Rohde, Xiaoxia Wu, Yinhan Liu
- UC2: Universal Cross-lingual Cross-modal Vision-and-language Pre-training Mingyang Zhou et al.
- PAQ: 65 Million Probably-asked Questions And What You Can Do With Them Patrick Lewis et al.
- Cross-attention Is All You Need: Adapting Pretrained Transformers For Machine Translation Mozhdeh Gheini, Xiang Ren, Jonathan May
- Calibrate Before Use: Improving Few-shot Performance Of Language Models Tony Z. Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh
- Knowledge Neurons In Pretrained Transformers Damai Dai et al.
- What To Pre-train On? Efficient Intermediate Task Selection Clifton Poth, Jonas Pfeiffer, Andreas Rücklé, Iryna Gurevych
- Codet5: Identifier-aware Unified Pre-trained Encoder-decoder Models For Code Understanding And Generation Yue Wang, Weishi Wang, Shafiq Joty, Steven C. H. Hoi
- Meta-learning Via Language Model In-context Tuning Yanda Chen, Ruiqi Zhong, Sheng Zha, George Karypis, He He
- Empowering News Recommendation With Pre-trained Language Models Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang
- Structurallm: Structural Pre-training For Form Understanding Chenliang Li et al.
- Larger-scale Transformers For Multilingual Masked Language Modeling Naman Goyal, Jingfei Du, Myle Ott, Giri Anantharaman, Alexis Conneau
- Efficient Retrieval Augmented Generation From Unstructured Knowledge For Task-oriented Dialog David Thulke, Nico Daheim, Christian Dugast, Hermann Ney
- Generating Datasets With Pretrained Language Models Timo Schick, Hinrich Schütze
- RAFT: A Real-world Few-shot Text Classification Benchmark Neel Alex et al.
- Non-invasive Self-attention For Side Information Fusion In Sequential Recommendation Chang Liu et al.
- Simvlm: Simple Visual Language Model Pretraining With Weak Supervision Zirui Wang et al.
- Is GPT-3 Text Indistinguishable From Human Text? Scarecrow: A Framework For Scrutinizing Machine Text Yao Dou, Maxwell Forbes, Rik Koncel-kedziorski, Noah A. Smith, Yejin Choi
- Climatebert: A Pretrained Language Model For Climate-related Text Nicolas Webersinke, Mathias Kraus, Julia Anna Bingler, Markus Leippold
- See, Hear, Read: Leveraging Multimodality With Guided Attention For Abstractive Text Summarization Yash Kumar Atri, Shraman Pramanick, Vikram Goyal, Tanmoy Chakraborty
- A Short Survey Of Pre-trained Language Models For Conversational AI - A New Age In NLP Munazza Zaib, Quan Z. Sheng, Wei Emma Zhang
- Vision Guided Generative Pre-trained Language Models For Multimodal Abstractive Summarization Tiezheng Yu, Wenliang Dai, Zihan Liu, Pascale Fung
- Understanding And Overcoming The Challenges Of Efficient Transformer Quantization Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort
- Neural Path Hunter: Reducing Hallucination In Dialogue Systems Via Path Grounding Nouha Dziri, Andrea Madotto, Osmar Zaiane, Avishek Joey Bose
- SGEITL: Scene Graph Enhanced Image-text Learning For Visual Commonsense Reasoning Zhecan Wang et al.
- DYLE: Dynamic Latent Extraction For Abstractive Long-input Summarization Ziming Mao et al.
- Medically Aware GPT-3 As A Data Generator For Medical Dialogue Summarization Bharath Chintagunta, Namit Katariya, Xavier Amatriain, Anitha Kannan
- CDLM: Cross-document Language Modeling Avi Caciularu et al.
- Debertav3: Improving Deberta Using Electra-style Pre-training With Gradient-disentangled Embedding Sharing Pengcheng He, Jianfeng Gao, Weizhu Chen
- Towards Facilitating Empathic Conversations In Online Mental Health Support: A Reinforcement Learning Approach Ashish Sharma, Inna W. Lin, Adam S. Miner, David C. Atkins, Tim Althoff
- AI Chains: Transparent And Controllable Human-ai Interaction By Chaining Large Language Model Prompts Tongshuang Wu, Michael Terry, Carrie J. Cai
- Multilingual LAMA: Investigating Knowledge In Multilingual Pretrained Language Models Nora Kassner, Philipp Dufter, Hinrich Schütze
- Muppet: Massive Multi-task Representations With Pre-finetuning Armen Aghajanyan et al.
- Math Word Problem Generation With Mathematical Consistency And Problem Context Constraints Zichao Wang, Andrew S. Lan, Richard G. Baraniuk
- When Attention Meets Fast Recurrence: Training Language Models With Reduced Compute Tao Lei
- Denseclip: Language-guided Dense Prediction With Context-aware Prompting Yongming Rao et al.
- COCO-LM: Correcting And Contrasting Text Sequences For Language Model Pretraining Yu Meng et al.
- Baleen: Robust Multi-hop Reasoning At Scale Via Condensed Retrieval Omar Khattab, Christopher Potts, Matei Zaharia
- An Empirical Study Of GPT-3 For Few-shot Knowledge-based VQA Zhengyuan Yang et al.
- Task-oriented Dialogue System As Natural Language Generation Weizhi Wang et al.
- Episodic Transformer For Vision-and-language Navigation Alexander Pashevich, Cordelia Schmid, Chen Sun
- Tacl: Improving BERT Pre-training With Token-aware Contrastive Learning Yixuan Su et al.
- KM-BART: Knowledge Enhanced Multimodal BART For Visual Commonsense Generation Yiran Xing et al.
- How Many Data Points Is A Prompt Worth? Teven Le Scao, Alexander M. Rush
- Towards Retrieval-based Conversational Recommendation Ahtsham Manzoor, Dietmar Jannach
- Bertese: Learning To Speak To BERT Adi Haviv, Jonathan Berant, Amir Globerson
- Scalable And Efficient Moe Training For Multitask Multilingual Models Young Jin Kim et al.
- Distilling Large Language Models Into Tiny And Effective Students Using Pqrnn Prabhu Kaliamoorthi, Aditya Siddhant, Edward Li, Melvin Johnson
- Challenges In Detoxifying Language Models Johannes Welbl et al.
- Rome Was Built In 1776: A Case Study On Factual Correctness In Knowledge-grounded Response Generation Sashank Santhanam et al.
- Beyond Goldfish Memory: Long-term Open-domain Conversation Jing Xu, Arthur Szlam, Jason Weston
- Rethink Training Of BERT Rerankers In Multi-stage Retrieval Pipeline Luyu Gao, Zhuyun Dai, Jamie Callan
- Taming Sparsely Activated Transformer With Stochastic Experts Simiao Zuo et al.
- Improving Question Answering Model Robustness With Synthetic Adversarial Data Generation Max Bartolo et al.
- Hiddencut: Simple Data Augmentation For Natural Language Understanding With Better Generalization Jiaao Chen, Dinghan Shen, Weizhu Chen, Diyi Yang
- FLAT: An Optimized Dataflow For Mitigating Attention Bottlenecks Sheng-chun Kao, Suvinay Subramanian, Gaurav Agrawal, Amir Yazdanbakhsh, Tushar Krishna
- What Makes Good In-context Examples For GPT-3? Jiachang Liu et al.
- Compm: Context Modeling With Speaker's Pre-trained Memory Tracking For Emotion Recognition In Conversation Joosung Lee, Wooin Lee
- Gpt3mix: Leveraging Large-scale Language Models For Text Augmentation Kang Min Yoo, Dongju Park, Jaewook Kang, Sang-woo Lee, Woomyeong Park
- Hurdles To Progress In Long-form Question Answering Kalpesh Krishna, Aurko Roy, Mohit Iyyer
- Learning To Prompt For Vision-language Models Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu
- Few-shot Knowledge Graph-to-text Generation With Pretrained Language Models Junyi Li et al.
- Quality: Question Answering With Long Input Texts, Yes! Richard Yuanzhe Pang et al.
- Visualgpt: Data-efficient Adaptation Of Pretrained Language Models For Image Captioning Jun Chen, Han Guo, Kai Yi, Boyang Li, Mohamed Elhoseiny
- Normformer: Improved Transformer Pretraining With Extra Normalization Sam Shleifer, Jason Weston, Myle Ott
- Raise A Child In Large Language Model: Towards Effective And Generalizable Fine-tuning Runxin Xu et al.
- FILM: Following Instructions In Language With Modular Methods So Yeon Min, Devendra Singh Chaplot, Pradeep Ravikumar, Yonatan Bisk, Ruslan Salakhutdinov
- Diverse Demonstrations Improve In-context Compositional Generalization Itay Levy, Ben Bogin, Jonathan Berant
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- Explanations From Large Language Models Make Small Reasoners Better Shiyang Li et al.
- Scaling Instruction-finetuned Language Models Hyung Won Chung et al.
- Reacc: A Retrieval-augmented Code Completion Framework Shuai Lu et al.
- Interactive Code Generation Via Test-driven User-intent Formalization Shuvendu K. Lahiri et al.
- A Survey On Retrieval-augmented Text Generation Huayang Li, Yixuan Su, Deng Cai, Yan Wang, Lemao Liu
- Robotic Skill Acquisition Via Instruction Augmentation With Vision-language Models Ted Xiao et al.
- One Embedder, Any Task: Instruction-finetuned Text Embeddings Hongjin Su et al.
- Selective Annotation Makes Language Models Better Few-shot Learners Hongjin Su et al.
- Inner Monologue: Embodied Reasoning Through Planning With Language Models Wenlong Huang et al.
- Murag: Multimodal Retrieval-augmented Generator For Open Question Answering Over Images And Text Wenhu Chen, Hexiang Hu, Xi Chen, Pat Verga, William W. Cohen
- Few-shot Parameter-efficient Fine-tuning Is Better And Cheaper Than In-context Learning Haokun Liu et al.
- An Efficient Memory-augmented Transformer For Knowledge-intensive NLP Tasks Yuxiang Wu et al.
- Program Of Thoughts Prompting: Disentangling Computation From Reasoning For Numerical Reasoning Tasks Wenhu Chen, Xueguang Ma, Xinyi Wang, William W. Cohen
- In-context Learning For Few-shot Dialogue State Tracking Yushi Hu et al.
- Decoupling Knowledge From Memorization: Retrieval-augmented Prompt Learning Xiang Chen et al.
- Less Is More: Learning To Refine Dialogue History For Personalized Dialogue Generation Hanxun Zhong, Zhicheng Dou, Yutao Zhu, Hongjin Qian, Ji-rong Wen
- Pali: A Jointly-scaled Multilingual Language-image Model Xi Chen et al.
- How To Prompt? Opportunities And Challenges Of Zero- And Few-shot Learning For Human-ai Interaction In Creative Applications Of Generative Models Hai Dang, Lukas Mecke, Florian Lehmann, Sven Goller, Daniel Buschek
- Multi-stage Prompting For Knowledgeable Dialogue Generation Zihan Liu et al.
- Hybrid Transformer With Multi-level Fusion For Multimodal Knowledge Graph Completion Xiang Chen et al.
- Evaluating And Inducing Personality In Pre-trained Language Models Guangyuan Jiang et al.
- Recitation-augmented Language Models Zhiqing Sun, Xuezhi Wang, Yi Tay, Yiming Yang, Denny Zhou
- Dialfred: Dialogue-enabled Agents For Embodied Instruction Following Xiaofeng Gao et al.
- Promptcap: Prompt-guided Task-aware Image Captioning Yushi Hu et al.
- Synchromesh: Reliable Code Generation From Pre-trained Language Models Gabriel Poesia et al.
- Hitskt: A Hierarchical Transformer Model For Session-aware Knowledge Tracing Fucai Ke et al.
- Large Language Models Encode Clinical Knowledge Karan Singhal et al.
- Speechprompt: An Exploration Of Prompt Tuning On Generative Spoken Language Model For Speech Processing Tasks Kai-wei Chang, Wei-cheng Tseng, Shang-wen Li, Hung-yi Lee
- BLIP: Bootstrapping Language-image Pre-training For Unified Vision-language Understanding And Generation Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi
- Accelerating Attention Through Gradient-based Learned Runtime Pruning Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, Hadi Esmaeilzadeh, Mingu Kang
- Leveraging Large Language Models For Multiple Choice Question Answering Joshua Robinson, Christopher Michael Rytting, David Wingate
- Training Compute-optimal Large Language Models Jordan Hoffmann et al.
- Can Large Language Models Truly Understand Prompts? A Case Study With Negated Prompts Joel Jang, Seonghyeon Ye, Minjoon Seo
- Vision-language Pre-training With Triple Contrastive Learning Jinyu Yang et al.
- Language Models (mostly) Know What They Know Saurav Kadavath et al.
- RASAT: Integrating Relational Structures Into Pretrained Seq2seq Model For Text-to-sql Jiexing Qi et al.
- Measuring Progress On Scalable Oversight For Large Language Models Samuel R. Bowman et al.
- Knowledge Prompting In Pre-trained Language Model For Natural Language Understanding Jianing Wang et al.
- Ask Me Anything: A Simple Strategy For Prompting Language Models Simran Arora et al.
- Improving The Domain Adaptation Of Retrieval Augmented Generation (RAG) Models For Open Domain Question Answering Shamane Siriwardhana et al.
- Adapting Pre-trained Language Models To African Languages Via Multilingual Adaptive Fine-tuning Jesujoba O. Alabi, David Ifeoluwa Adelani, Marius Mosbach, Dietrich Klakow
- Revisiting The "video" In Video-language Understanding Shyamal Buch et al.
- A Fine-grained Comparison Of Pragmatic Language Understanding In Humans And Language Models Jennifer Hu, Sammy Floyd, Olessia Jouravlev, Evelina Fedorenko, Edward Gibson
- Action-gpt: Leveraging Large-scale Language Models For Improved And Generalized Action Generation Sai Shashank Kalakonda, Shubh Maheshwari, Ravi Kiran Sarvadevabhatla
- Neural Theory-of-mind? On The Limits Of Social Intelligence In Large Lms Maarten Sap, Ronan Lebras, Daniel Fried, Yejin Choi
- Prompting Is Programming: A Query Language For Large Language Models Luca Beurer-kellner, Marc Fischer, Martin Vechev
- Real Or Fake Text?: Investigating Human Ability To Detect Boundaries Between Human-written And Machine-generated Text Liam Dugan, Daphne Ippolito, Arun Kirubarajan, Sherry Shi, Chris Callison-burch
- Data Distributional Properties Drive Emergent In-context Learning In Transformers Stephanie C. Y. Chan et al.
- The Goldilocks Of Pragmatic Understanding: Fine-tuning Strategy Matters For Implicature Resolution By Llms Laura Ruis et al.
- Distilling Reasoning Capabilities Into Smaller Language Models Kumar Shridhar, Alessandro Stolfo, Mrinmaya Sachan
- Promptagator: Few-shot Dense Retrieval From 8 Examples Zhuyun Dai et al.
- Do As I Can, Not As I Say: Grounding Language In Robotic Affordances Michael Ahn et al.
- Efficient Long-text Understanding With Short-text Models Maor Ivgi, Uri Shaham, Jonathan Berant
- Can Large Language Models Reason About Medical Questions? Valentin Liévin, Christoffer Egeberg Hother, Andreas Geert Motzfeldt, Ole Winther
- Visual Prompt Tuning Menglin Jia et al.
- Gpt-3-driven Pedagogical Agents For Training Children's Curious Question-asking Skills Rania Abdelghani et al.
- Can Machines Help Us Answering Question 16 In Datasheets, And In Turn Reflecting On Inappropriate Content? Patrick Schramowski, Christopher Tauchmann, Kristian Kersting
- Prompt Distribution Learning Yuning Lu, Jianzhuang Liu, Yonggang Zhang, Yajing Liu, Xinmei Tian
- Memory-based Model Editing At Scale Eric Mitchell, Charles Lin, Antoine Bosselut, Christopher D. Manning, Chelsea Finn
- Star: Bootstrapping Reasoning With Reasoning Eric Zelikman, Yuhuai Wu, Jesse Mu, Noah D. Goodman
- Compilable Neural Code Generation With Compiler Feedback Xin Wang et al.
- IGLUE: A Benchmark For Transfer Learning Across Modalities, Tasks, And Languages Emanuele Bugliarello et al.
- GPTQ: Accurate Post-training Quantization For Generative Pre-trained Transformers Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh
- A Generative Language Model For Few-shot Aspect-based Sentiment Analysis Ehsan Hosseini-asl, Wenhao Liu, Caiming Xiong
- Successive Prompting For Decomposing Complex Questions Dheeru Dua, Shivanshu Gupta, Sameer Singh, Matt Gardner
- Visual-language Navigation Pretraining Via Prompt-based Environmental Self-exploration Xiwen Liang, Fengda Zhu, Lingling Li, Hang Xu, Xiaodan Liang
- Binding Language Models In Symbolic Languages Zhoujun Cheng et al.
- Learning Vector-quantized Item Representation For Transferable Sequential Recommenders Yupeng Hou, Zhankui He, Julian Mcauley, Wayne Xin Zhao
- Block-recurrent Transformers Delesley Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur
- Future Transformer For Long-term Action Anticipation Dayoung Gong, Joonseok Lee, Manjin Kim, Seong Jong Ha, Minsu Cho
- Neural Pipeline For Zero-shot Data-to-text Generation Zdeněk Kasner, Ondřej Dušek
- Rationale-augmented Ensembles In Language Models Xuezhi Wang et al.
- CERT: Continual Pre-training On Sketches For Library-oriented Code Generation Daoguang Zan et al.
- Scaling Laws And Interpretability Of Learning From Repeated Data Danny Hernandez et al.
- Discovering Latent Knowledge In Language Models Without Supervision Collin Burns, Haotian Ye, Dan Klein, Jacob Steinhardt
- Putting Gpt-3's Creativity To The (alternative Uses) Test Claire Stevenson, Iris Smal, Matthijs Baas, Raoul Grasman, Han Van Der Maas
- Learning Video Representations From Large Language Models Yue Zhao, Ishan Misra, Philipp Krähenbühl, Rohit Girdhar
- Augesc: Dialogue Augmentation With Large Language Models For Emotional Support Conversation Chujie Zheng, Sahand Sabour, Jiaxin Wen, Zheng Zhang, Minlie Huang
- Competition-level Code Generation With Alphacode Yujia Li et al.
- Prompt For Extraction? PAIE: Prompting Argument Interaction For Event Argument Extraction Yubo Ma et al.
- A Unified End-to-end Retriever-reader Framework For Knowledge-based VQA Yangyang Guo et al.
- Self-consistency Improves Chain Of Thought Reasoning In Language Models Xuezhi Wang et al.
- Code4struct: Code Generation For Few-shot Event Structure Prediction Xingyao Wang, Sha Li, Heng Ji
- Scaling Language-image Pre-training Via Masking Yanghao Li, Haoqi Fan, Ronghang Hu, Christoph Feichtenhofer, Kaiming He
- No More Fine-tuning? An Experimental Evaluation Of Prompt Tuning In Code Intelligence Chaozheng Wang et al.
- Long-form Video-language Pre-training With Multimodal Temporal Contrastive Learning Yuchong Sun et al.
- Complexity-based Prompting For Multi-step Reasoning Yao Fu, Hao Peng, Ashish Sabharwal, Peter Clark, Tushar Khot
- Adamix: Mixture-of-adaptations For Parameter-efficient Model Tuning Yaqing Wang et al.
- Optimizing Prompts For Text-to-image Generation Yaru Hao, Zewen Chi, Li Dong, Furu Wei
- Exploring The Limits Of Domain-adaptive Training For Detoxifying Large-scale Language Models Boxin Wang et al.
- Expanding Language-image Pretrained Models For General Video Recognition Bolin Ni et al.
- Impact Of Pretraining Term Frequencies On Few-shot Reasoning Yasaman Razeghi, Robert L. Iv Logan, Matt Gardner, Sameer Singh
- Audiolm: A Language Modeling Approach To Audio Generation Zalán Borsos et al.
- Codet: Code Generation With Generated Tests Bei Chen et al.
- GODEL: Large-scale Pre-training For Goal-directed Dialog Baolin Peng et al.
- What Do They Capture? -- A Structural Analysis Of Pre-trained Language Models For Source Code Yao Wan et al.
- Retrieval Augmentation Of Large Language Models For Lay Language Generation Yue Guo, Wei Qiu, Gondy Leroy, Sheng Wang, Trevor Cohen
- Automatic Chain Of Thought Prompting In Large Language Models Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola
- Generative Language Models For Paragraph-level Question Generation Asahi Ushio, Fernando Alva-manchego, Jose Camacho-collados
- GLM-130B: An Open Bilingual Pre-trained Model Aohan Zeng et al.
- Grips: Gradient-free, Edit-based Instruction Search For Prompting Large Language Models Archiki Prasad, Peter Hase, Xiang Zhou, Mohit Bansal
- Mslam: Massively Multilingual Joint Pre-training For Speech And Text Ankur Bapna et al.
- Socratic Models: Composing Zero-shot Multimodal Reasoning With Language Andy Zeng et al.
- TIARA: Multi-grained Retrieval For Robust Question Answering Over Large Knowledge Bases Yiheng Shu et al.
- Active Example Selection For In-context Learning Yiming Zhang, Shi Feng, Chenhao Tan
- Standing On The Shoulders Of Giant Frozen Language Models Yoav Levine et al.
- WANLI: Worker And AI Collaboration For Natural Language Inference Dataset Creation Alisa Liu, Swabha Swayamdipta, Noah A. Smith, Yejin Choi
- When Not To Trust Language Models: Investigating Effectiveness Of Parametric And Non-parametric Memories Alex Mallen et al.
- Position-guided Text Prompt For Vision-language Pre-training Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan
- REVEAL: Retrieval-augmented Visual-language Pre-training With Multi-source Multimodal Knowledge Memory Ziniu Hu et al.
- A New Path: Scaling Vision-and-language Navigation With Synthetic Instructions And Imitation Learning Aishwarya Kamath et al.
- Palm: Scaling Language Modeling With Pathways Aakanksha Chowdhery et al.
- Retrieval-augmented Generative Question Answering For Event Argument Extraction Xinya Du, Heng Ji
- Learn To Explain: Multimodal Reasoning Via Thought Chains For Science Question Answering Pan Lu et al.
- ROSCOE: A Suite Of Metrics For Scoring Step-by-step Reasoning Olga Golovneva et al.
- What Matters In Language Conditioned Robotic Imitation Learning Over Unstructured Data Oier Mees, Lukas Hermann, Wolfram Burgard
- Parallel Context Windows For Large Language Models Nir Ratner et al.
- Demonstrate-search-predict: Composing Retrieval And Language Models For Knowledge-intensive NLP Omar Khattab et al.
- Delta Tuning: A Comprehensive Study Of Parameter Efficient Methods For Pre-trained Language Models Ning Ding et al.
- Learning To Compose Soft Prompts For Compositional Zero-shot Learning Nihal V. Nayak, Peilin Yu, Stephen H. Bach
- Unifiedskg: Unifying And Multi-tasking Structured Knowledge Grounding With Text-to-text Language Models Tianbao Xie et al.
- Vl-checklist: Evaluating Pre-trained Vision-language Models With Objects, Attributes And Relations Tiancheng Zhao et al.
- Large Language Models Are Reasoning Teachers Namgyu Ho, Laura Schmid, Se-young Yun
- Maple: Multi-modal Prompt Learning Muhammad Uzair Khattak, Hanoona Rasheed, Muhammad Maaz, Salman Khan, Fahad Shahbaz Khan
- Talking About Large Language Models Murray Shanahan
- Transformer Feed-forward Layers Build Predictions By Promoting Concepts In The Vocabulary Space Mor Geva, Avi Caciularu, Kevin Ro Wang, Yoav Goldberg
- Challenging Big-bench Tasks And Whether Chain-of-thought Can Solve Them Mirac Suzgun et al.
- Deep Bidirectional Language-knowledge Graph Pretraining Michihiro Yasunaga et al.
- Retrieval-augmented Multimodal Language Modeling Michihiro Yasunaga et al.
- Few-shot Training Llms For Project-specific Code-summarization Toufique Ahmed, Premkumar Devanbu
- Training And Evaluating A Jupyter Notebook Data Science Assistant Shubham Chandel, Colin B. Clement, Guillermo Serrato, Neel Sundaresan
- Re2g: Retrieve, Rerank, Generate Michael Glass et al.
- Instructdial: Improving Zero And Few-shot Generalization In Dialogue Through Instruction Tuning Prakhar Gupta et al.
- M6-rec: Generative Pretrained Language Models Are Open-ended Recommender Systems Zeyu Cui, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang
- Conversational Question Answering On Heterogeneous Sources Philipp Christmann, Rishiraj Saha Roy, Gerhard Weikum
- Holistic Evaluation Of Language Models Percy Liang et al.
- Enhancing CLIP With GPT-4: Harnessing Visual Descriptions As Prompts Mayug Maniparambil et al.
- Drivegpt4: Interpretable End-to-end Autonomous Driving Via Large Language Model Zhenhua Xu et al.
- An Empirical Evaluation Of Using Large Language Models For Automated Unit Test Generation Max Schäfer, Sarah Nadi, Aryaz Eghbali, Frank Tip
- Nl2spec: Interactively Translating Unstructured Natural Language To Temporal Logics With Large Language Models Matthias Cosler, Christopher Hahn, Daniel Mendoza, Frederik Schmitt, Caroline Trippel
- Large Language Models Effectively Leverage Document-level Context For Literary Translation, But Critical Errors Persist Marzena Karpinska, Mohit Iyyer
- Flexkbqa: A Flexible Llm-powered Framework For Few-shot Knowledge Base Question Answering Zhenyu Li et al.
- Document-level Machine Translation With Large Language Models Longyue Wang et al.
- Llm-grounded Diffusion: Enhancing Prompt Understanding Of Text-to-image Diffusion Models With Large Language Models Long Lian, Boyi Li, Adam Yala, Trevor Darrell
- Practical And Ethical Challenges Of Large Language Models In Education: A Systematic Scoping Review Lixiang Yan et al.
- Enhancing Few-shot Text-to-sql Capabilities Of Large Language Models: A Study On Prompt Design Strategies Linyong Nan et al.
- From Word Models To World Models: Translating From Natural Language To The Probabilistic Language Of Thought Lionel Wong et al.
- Taiyi: A Bilingual Fine-tuned Large Language Model For Diverse Biomedical Tasks Ling Luo et al.
- Leveraging Pre-trained Large Language Models To Construct And Utilize World Models For Model-based Task Planning Lin Guan, Karthik Valmeekam, Sarath Sreedharan, Subbarao Kambhampati
- Scaling Autoregressive Multi-modal Models: Pretraining And Instruction Tuning Lili Yu et al.
- Improving CLIP Training With Language Rewrites Lijie Fan, Dilip Krishnan, Phillip Isola, Dina Katabi, Yonglong Tian
- Automatically Correcting Large Language Models: Surveying The Landscape Of Diverse Self-correction Strategies Liangming Pan et al.
- Logic-lm: Empowering Large Language Models With Symbolic Solvers For Faithful Logical Reasoning Liangming Pan, Alon Albalak, Xinyi Wang, William Yang Wang
- A Survey On Large Language Models For Recommendation Likang Wu et al.
- Improving Text Embeddings With Large Language Models Liang Wang et al.
- A Survey On Hallucination In Large Language Models: Principles, Taxonomy, Challenges, And Open Questions Lei Huang et al.
- ULIP-2: Towards Scalable Multimodal Pre-training For 3D Understanding Le Xue et al.
- Surgicalgpt: End-to-end Language-vision GPT For Visual Question Answering In Surgery Lalithkumar Seenivasan, Mobarakol Islam, Gokul Kannan, Hongliang Ren
- Just Tell Me: Prompt Engineering In Business Process Management Kiran Busch, Alexander Rochlitzer, Diana Sola, Henrik Leopold
- Tallrec: An Effective And Efficient Tuning Framework To Align Large Language Model With Recommendation Keqin Bao et al.
- News Verifiers Showdown: A Comparative Performance Evaluation Of Chatgpt 3.5, Chatgpt 4.0, Bing AI, And Bard In News Fact-checking Kevin Matthe Caramancion
- Automating Customer Service Using Langchain: Building Custom Open-source GPT Chatbot For Organizations Keivalya Pandya, Mehfuza Holia
- Biomedical Knowledge Graph-optimized Prompt Generation For Large Language Models Karthik Soman et al.
- Waffling Around For Performance: Visual Classification With Random Words And Broad Concepts Karsten Roth et al.
- Topical-chat: Towards Knowledge-grounded Open-domain Conversations Karthik Gopalakrishnan et al.
- Mvp: Multi-view Prompting Improves Aspect Sentiment Tuple Prediction Zhibin Gou, Qingyan Guo, Yujiu Yang
- Towards Expert-level Medical Question Answering With Large Language Models Karan Singhal et al.
- Speechprompt V2: Prompt Tuning For Speech Classification Tasks Kai-wei Chang et al.
- Full Parameter Fine-tuning For Large Language Models With Limited Resources Kai Lv et al.
- The Rise And Potential Of Large Language Model Based Agents: A Survey Zhiheng Xi et al.
- Evaluation And Analysis Of Hallucination In Large Vision-language Models Junyang Wang et al.
- Ai-augmented Surveys: Leveraging Large Language Models And Surveys For Opinion Prediction Junsol Kim, Byungkyu Lee
- Llama-reviewer: Advancing Code Review Automation With Large Language Models Through Parameter-efficient Fine-tuning Junyi Lu, Lei Yu, Xiaojia Li, Li Yang, Chun Zuo
- Chatcounselor: A Large Language Models For Mental Health Support June M. Liu et al.
- Transferable Decoding With Visual Entities For Zero-shot Image Captioning Junjie Fei et al.
- Jatmo: Prompt Injection Defense By Task-specific Finetuning Julien Piet et al.
- Exploring The Benefits Of Training Expert Language Models Over Instruction Tuning Joel Jang et al.
- Generating Images With Multimodal Language Models Jing Yu Koh, Daniel Fried, Ruslan Salakhutdinov
- Grounding Language Models To Images For Multimodal Inputs And Outputs Jing Yu Koh, Ruslan Salakhutdinov, Daniel Fried
- When Large Language Models Meet Personalization: Perspectives Of Challenges And Opportunities Jin Chen et al.
- Prompt-and-align: Prompt-based Social Alignment For Few-shot Fake News Detection Jiaying Wu, Shen Li, Ailin Deng, Miao Xiong, Bryan Hooi
- Benchmarking Large Language Models In Retrieval-augmented Generation Jiawei Chen, Hongyu Lin, Xianpei Han, Le Sun
- Empowering Molecule Discovery For Molecule-caption Translation With Large Language Models: A Chatgpt Perspective Jiatong Li et al.
- Language Models Meet World Models: Embodied Experiences Enhance Language Models Jiannan Xiang et al.
- Think-on-graph: Deep And Responsible Reasoning Of Large Language Model On Knowledge Graph Jiashuo Sun et al.
- On Decoder-only Architecture For Speech-to-text And Large Language Model Integration Jian Wu et al.
- Onellm: One Framework To Align All Modalities With Language Jiaming Han et al.
- Ureader: Universal Ocr-free Visually-situated Language Understanding With Multimodal Large Language Model Jiabo Ye et al.
- Learning To Compress Prompts With Gist Tokens Jesse Mu, Xiang Lisa Li, Noah Goodman
- Leveraging Large Language Models For Sequential Recommendation Jesse Harte et al.
- Memory-efficient Fine-tuning Of Compressed Large Language Models Via Sub-4-bit Integer Quantization Jeonghoon Kim et al.
- Physically Grounded Vision-language Models For Robotic Manipulation Jensen Gao et al.
- Symbol Tuning Improves In-context Learning In Language Models Jerry Wei et al.
- Evaluating Large Language Models On A Highly-specialized Topic, Radiation Oncology Physics Jason Holmes et al.
- Chatgpt: Jack Of All Trades, Master Of None Jan Kocoń et al.
- Large Language Models (GPT) Struggle To Answer Multiple-choice Questions About Code Jaromir Savelka, Arav Agarwal, Christopher Bogart, Majd Sakr
- Paperqa: Retrieval-augmented Generative Agent For Scientific Research Jakub Lála et al.
- The Robots Are Here: Navigating The Generative AI Revolution In Computing Education James Prather et al.
- Thrilled By Your Progress! Large Language Models (GPT-4) No Longer Struggle To Pass Assessments In Higher Education Programming Courses Jaromir Savelka, Arav Agarwal, Marshall An, Chris Bogart, Majd Sakr
- A Comparative Study Of Ai-generated (GPT-4) And Human-crafted Mcqs In Programming Education Jacob Doughty et al.
- Chip-chat: Challenges And Opportunities In Conversational Hardware Design Jason Blocklove, Siddharth Garg, Ramesh Karri, Hammond Pearce
- Chatgpt In The Classroom: An Analysis Of Its Strengths And Weaknesses For Solving Undergraduate Computer Science Questions Ishika Joshi et al.
- More Robots Are Coming: Large Multimodal Models (chatgpt) Can Solve Visually Diverse Images Of Parsons Problems Irene Hou et al.
- "It's A Fair Game", Or Is It? Examining How Users Navigate Disclosure Risks And Benefits When Using Llm-based Conversational Agents Zhiping Zhang et al.
- Cognitive Mirage: A Review Of Hallucinations In Large Language Models Hongbin Ye, Tong Liu, Aijia Zhang, Wei Hua, Weiqiang Jia
- Semantic Compression With Large Language Models Henry Gilbert, Michael Sandborn, Douglas C. Schmidt, Jesse Spencer-smith, Jules White
- Large Language Models Can Infer Psychological Dispositions Of Social Media Users Heinrich Peters, Sandra Matz
- Self-chained Image-language Model For Video Localization And Question Answering Shoubin Yu, Jaemin Cho, Prateek Yadav, Mohit Bansal
- Unifying Large Language Models And Knowledge Graphs: A Roadmap Shirui Pan et al.
- Llm-in-the-loop: Leveraging Large Language Model For Thematic Analysis Shih-chieh Dai, Aiping Xiong, Lun-wei Ku
- Decoding Chatgpt: A Taxonomy Of Existing Research, Current Challenges, And Possible Future Directions Shahab Saquib Sohail et al.
- The Cot Collection: Improving Zero-shot And Few-shot Learning Of Language Models Via Chain-of-thought Fine-tuning Seungone Kim et al.
- Factscore: Fine-grained Atomic Evaluation Of Factual Precision In Long Form Text Generation Sewon Min et al.
- Next-gpt: Any-to-any Multimodal LLM Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, Tat-seng Chua
- Large Language Models Are Competitive Near Cold-start Recommenders For Language- And Item-based Preferences Scott Sanner, Krisztian Balog, Filip Radlinski, Ben Wedin, Lucas Dixon
- Ai-assisted Coding: Experiments With GPT-4 Russell A Poldrack, Thomas Lu, Gašper Beguš
- Are Emergent Abilities Of Large Language Models A Mirage? Rylan Schaeffer, Brando Miranda, Sanmi Koyejo
- Prompting For Multimodal Hateful Meme Classification Rui Cao, Roy Ka-wei Lee, Wen-haw Chong, Jing Jiang
- Retrieving Multimodal Information For Augmented Generation: A Survey Ruochen Zhao et al.
- Tinystories: How Small Can Language Models Be And Still Speak Coherent English? Ronen Eldan, Yuanzhi Li
- Retrieval-augmented Image Captioning Rita Ramos, Desmond Elliott, Bruno Martins
- Prompt, Generate, Then Cache: Cascade Of Foundation Models Makes Strong Few-shot Learners Renrui Zhang et al.
- A Universal Question-answering Platform For Knowledge Graphs Reham Omar, Ishika Dhall, Panos Kalnis, Essam Mansour
- Pro-cap: Leveraging A Frozen Vision-language Model For Hateful Meme Detection Rui Cao et al.
- Lawyer Llama Technical Report Quzhe Huang et al.
- Mplug-owl: Modularization Empowers Large Language Models With Multimodality Qinghao Ye et al.
- Translating Radiology Reports Into Plain Language Using Chatgpt And GPT-4 With Prompt Learning: Promising Results, Limitations, And Potential Qing Lyu et al.
- ONCE: Boosting Content-based Recommendation With Both Open- And Closed-source Large Language Models Qijiong Liu, Nuo Chen, Tetsuya Sakai, Xiao-ming Wu
- Genegpt: Augmenting Large Language Models With Domain Tools For Improved Access To Biomedical Information Qiao Jin, Yifan Yang, Qingyu Chen, Zhiyong Lu
- Prompting The Hidden Talent Of Web-scale Speech Models For Zero-shot Task Generalization Puyuan Peng, Brian Yan, Shinji Watanabe, David Harwath
- Harnessing Llms In Curricular Design: Using GPT-4 To Support Authoring Of Learning Objectives Pragnya Sridhar et al.
- Selfcheckgpt: Zero-resource Black-box Hallucination Detection For Generative Large Language Models Potsawee Manakul, Adian Liusie, Mark J. F. Gales
- Masked Vision And Language Pre-training With Unimodal And Multimodal Contrastive Losses For Medical Visual Question Answering Pengfei Li, Gang Liu, Jinlong He, Zixu Zhao, Shenjun Zhong
- Chat-univi: Unified Visual Representation Empowers Large Language Models With Image And Video Understanding Peng Jin, Ryuichi Takanobu, Wancai Zhang, Xiaochun Cao, Li Yuan
- Audiopalm: A Large Language Model That Can Speak And Listen Paul K. Rubenstein et al.
- Going Beyond Nouns With Vision & Language Models Using Synthetic Data Paola Cascante-bonilla et al.
- In-context Retrieval-augmented Language Models Ori Ram et al.
- Fine-tuning Or Retrieval? Comparing Knowledge Injection In Llms Oded Ovadia, Menachem Brief, Moshik Mishaeli, Oren Elisha
- Fusecap: Leveraging Large Language Models For Enriched Fused Image Captions Noam Rotstein, David Bensaid, Shaked Brody, Roy Ganz, Ron Kimmel
- Enhancing Chat Language Models By Scaling High-quality Instructional Conversations Ning Ding et al.
- Chatgpt Is A Knowledgeable But Inexperienced Solver: An Investigation Of Commonsense Problem In Large Language Models Ning Bian et al.
- Bridging The Gap: A Survey On Integrating (human) Feedback For Natural Language Generation Patrick Fernandes et al.
- CAT-LM: Training Language Models On Aligned Code And Tests Nikitha Rao, Kush Jain, Uri Alon, Claire Le Goues, Vincent J. Hellendoorn
- Datatales: Investigating The Use Of Large Language Models For Authoring Data-driven Articles Nicole Sultanum, Arjun Srinivasan
- A Stitch In Time Saves Nine: Detecting And Mitigating Hallucinations Of Llms By Validating Low-confidence Generation Neeraj Varshney, Wenlin Yao, Hongming Zhang, Jianshu Chen, Dong Yu
- Label Supervised Llama Finetuning Zongxi Li et al.
- Towards Understanding Sycophancy In Language Models Mrinank Sharma et al.
- Using Large Language Models To Generate Junit Tests: An Empirical Study Mohammed Latif Siddiq et al.
- Api-bank: A Comprehensive Benchmark For Tool-augmented Llms Minghao Li et al.
- Time-llm: Time Series Forecasting By Reprogramming Large Language Models Ming Jin et al.
- Reward Design With Language Models Minae Kwon, Sang Michael Xie, Kalesha Bullard, Dorsa Sadigh
- Selenite: Scaffolding Online Sensemaking With Comprehensive Overviews Elicited From Large Language Models Michael Xieyang Liu et al.
- Large Language Models Are Effective Text Rankers With Pairwise Ranking Prompting Zhen Qin et al.
- Psy-llm: Scaling Up Global Mental Health Psychological Services With Ai-based Large Language Models Tin Lai et al.
- Empirical Study Of Zero-shot NER With Chatgpt Tingyu Xie et al.
- Qlora: Efficient Finetuning Of Quantized Llms Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, Luke Zettlemoyer
- Encouraging Divergent Thinking In Large Language Models Through Multi-agent Debate Tian Liang et al.
- Grounding Large Language Models In Interactive Environments With Online Reinforcement Learning Thomas Carta et al.
- Few-shot In-context Learning For Knowledge Base Question Answering Tianle Li et al.
- Caption Anything: Interactive Image Description With Diverse Multimodal Controls Teng Wang et al.
- Open-ended Medical Visual Question Answering Through Prefix Tuning Of Language Models Tom Van Sonsbeek, Mohammad Mahdi Derakhshani, Ivona Najdenkoska, Cees G. M. Snoek, Marcel Worring
- Mindfuldiary: Harnessing Large Language Model To Support Psychiatric Patients' Journaling Taewan Kim et al.
- Automl-gpt: Automatic Machine Learning With GPT Shujian Zhang, Chengyue Gong, Lemeng Wu, Xingchao Liu, Mingyuan Zhou
- From Words To Watts: Benchmarking The Energy Costs Of Large Language Model Inference Siddharth Samsi et al.
- The Troubling Emergence Of Hallucination In Large Language Models -- An Extensive Definition, Quantification, And Prescriptive Remediations Vipula Rawte et al.
- Is GPT-4 A Reliable Rater? Evaluating Consistency In GPT-4 Text Ratings Veronika Hackl, Alexandra Elena Müller, Michael Granitzer, Maximilian Sailer
- Evaluating Correctness And Faithfulness Of Instruction-following Models For Question Answering Vaibhav Adlakha, Parishad Behnamghader, Xing Han Lu, Nicholas Meade, Siva Reddy
- Language Model Behavior: A Comprehensive Survey Tyler A. Chang, Benjamin K. Bergen
- Automating Human Tutor-style Programming Feedback: Leveraging GPT-4 Tutor Model For Hint Generation And GPT-3.5 Student Model For Hint Validation Tung Phung et al.
- Freshllms: Refreshing Large Language Models With Search Engine Augmentation Tu Vu et al.
- Large Language Models Fail On Trivial Alterations To Theory-of-mind Tasks Tomer Ullman
- REPLUG: Retrieval-augmented Black-box Language Models Weijia Shi et al.
- Copiloting The Copilots: Fusing Large Language Models With Completion Engines For Automated Program Repair Yuxiang Wei, Chunqiu Steven Xia, Lingming Zhang
- Llmrec: Large Language Models With Graph Augmentation For Recommendation Wei Wei et al.
- Can Large Language Models Provide Useful Feedback On Research Papers? A Large-scale Empirical Analysis Weixin Liang et al.
- Chatgraph: Interpretable Text Classification By Converting Chatgpt Knowledge To Graphs Yucheng Shi et al.
- Can Generalist Foundation Models Outcompete Special-purpose Tuning? Case Study In Medicine Harsha Nori et al.
- Not All Languages Are Created Equal In Llms: Improving Multilingual Capability By Cross-lingual-thought Prompting Haoyang Huang et al.
- Improved Baselines With Visual Instruction Tuning Haotian Liu, Chunyuan Li, Yuheng Li, Yong Jae Lee
- Is Chatgpt The Ultimate Programming Assistant -- How Far Is It? Haoye Tian et al.
- Chatkbqa: A Generate-then-retrieve Framework For Knowledge Base Question Answering With Fine-tuned Large Language Models Haoran Luo et al.
- CMMLU: Measuring Massive Multitask Language Understanding In Chinese Haonan Li et al.
- Can Chatgpt Assess Human Personalities? A General Evaluation Framework Haocong Rao, Cyril Leung, Chunyan Miao
- Q-instruct: Improving Low-level Visual Abilities For Multi-modality Foundation Models Haoning Wu et al.
- Lmdrive: Closed-loop End-to-end Driving With Large Language Models Hao Shao et al.
- Languagempc: Large Language Models As Decision Makers For Autonomous Driving Hao Sha et al.
- Video-llama: An Instruction-tuned Audio-visual Language Model For Video Understanding Hang Zhang, Xin Li, Lidong Bing
- Llama Guard: Llm-based Input-output Safeguard For Human-ai Conversations Hakan Inan et al.
- GPT-4 Enhanced Multimodal Grounding For Autonomous Driving: Leveraging Cross-modal Attention With Large Language Models Haicheng Liao et al.
- Auggpt: Leveraging Chatgpt For Text Data Augmentation Haixing Dai et al.
- Chatgpt Hallucinates When Attributing Answers Guido Zuccon, Bevan Koopman, Razia Shaik
- Dr Chatgpt, Tell Me What I Want To Hear: How Prompt Knowledge Impacts Health Answer Correctness Guido Zuccon, Bevan Koopman
- Augmented Language Models: A Survey Grégoire Mialon et al.
- Level Generation Through Large Language Models Graham Todd, Sam Earle, Muhammad Umair Nasir, Michael Cerny Green, Julian Togelius
- Explainability For Large Language Models: A Survey Haiyan Zhao et al.
- Synthetic Data Generation With Large Language Models For Text Classification: Potential And Limitations Zhuoyan Li, Hangxiao Zhu, Zhuoran Lu, Ming Yin
- Repocoder: Repository-level Code Completion Through Iterative Retrieval And Generation Fengji Zhang et al.
- LLMR: Real-time Prompting Of Interactive Worlds Using Large Language Models Fernanda De La Torre et al.
- Empower Large Language Model To Perform Better On Industrial Domain-specific Question Answering Fangkai Yang et al.
- Learning To Prompt In The Classroom To Understand AI Limits: A Pilot Study Emily Theophilou et al.
- Should Chatgpt Be Biased? Challenges And Risks Of Bias In Large Language Models Emilio Ferrara
- Language Model Crossover: Variation Through Few-shot Prompting Elliot Meyerson et al.
- Vipergpt: Visual Inference Via Python Execution For Reasoning Dídac Surís, Sachit Menon, Carl Vondrick
- GPT-4 Can Pass The Korean National Licensing Examination For Korean Medicine Doctors Dongyeop Jang, Tae-rim Yun, Choong-yeol Lee, Young-kyu Kwon, Chang-eop Kim
- Llm-blender: Ensembling Large Language Models With Pairwise Ranking And Generative Fusion Dongfu Jiang, Xiang Ren, Bill Yuchen Lin
- Read-only Prompt Optimization For Vision-language Few-shot Learning Dongjun Lee et al.
- MELTR: Meta Loss Transformer For Learning To Fine-tune Video Foundation Models Dohwan Ko et al.
- Promptner: Prompting For Named Entity Recognition Dhananjay Ashok, Zachary C. Lipton
- Minigpt-4: Enhancing Vision-language Understanding With Advanced Large Language Models Deyao Zhu, Jun Chen, Xiaoqian Shen, Xiang Li, Mohamed Elhoseiny
- Fine-tuning Chatgpt For Automatic Scoring Ehsan Latif, Xiaoming Zhai
- Almanac: Retrieval-augmented Language Models For Clinical Medicine Cyril Zakka et al.
- Improving Accuracy Of GPT-3/4 Results On Biomedical Data Using A Retrieval-augmented Language Model David Soong et al.
- Llava-med: Training A Large Language-and-vision Assistant For Biomedicine In One Day Chunyuan Li et al.
- Conversational Automated Program Repair Chunqiu Steven Xia, Lingming Zhang
- Opportunities And Risks Of Llms For Scalable Deliberation With Polis Christopher T. Small et al.
- A Study On The Implementation Of Generative AI Services Using An Enterprise Data-based LLM Application Architecture Cheonsu Jeong
- Llm-powered Data Augmentation For Enhanced Cross-lingual Performance Chenxi Whitehouse, Monojit Choudhury, Alham Fikri Aji
- Distilling Step-by-step! Outperforming Larger Language Models With Less Training Data And Smaller Model Sizes Cheng-yu Hsieh et al.
- An Iterative Optimizing Framework For Radiology Report Summarization With Chatgpt Chong Ma et al.
- Chatdev: Communicative Agents For Software Development Chen Qian et al.
- Supporting Human-ai Collaboration In Auditing Llms With Llms Charvi Rastogi, Marco Tulio Ribeiro, Nicholas King, Harsha Nori, Saleema Amershi
- One Small Step For Generative AI, One Giant Leap For AGI: A Complete Survey On Chatgpt In AIGC Era Chaoning Zhang et al.
- Llmseceval: A Dataset Of Natural Language Prompts For Security Evaluations Catherine Tony, Markus Mutas, Nicolás E. Díaz Ferreyra, Riccardo Scandariato
- Receive, Reason, And React: Drive As You Say With Large Language Models In Autonomous Vehicles Can Cui, Yunsheng Ma, Xu Cao, Wenqian Ye, Ziran Wang
- Adapting Large Language Models By Integrating Collaborative Semantics For Recommendation Bowen Zheng et al.
- LLM+P: Empowering Large Language Models With Optimal Planning Proficiency Bo Liu et al.
- RWKV: Reinventing Rnns For The Transformer Era Bo Peng et al.
- Clinical Camel: An Open Expert-level Medical Language Model With Dialogue-based Knowledge Encoding Augustin Toma et al.
- Refactoring Programs Using Large Language Models With Few-shot Examples Atsushi Shirafuji, Yusuke Oda, Jun Suzuki, Makoto Morishita, Yutaka Watanobe
- The False Promise Of Imitating Proprietary Llms Arnav Gudibande et al.
- Interpretable Long-form Legal Question Answering With Retrieval-augmented Large Language Models Antoine Louis, Gijs Van Dijck, Gerasimos Spanakis
- Expel: LLM Agents Are Experiential Learners Andrew Zhao et al.
- Openflamingo: An Open-source Framework For Training Large Autoregressive Vision-language Models Anas Awadalla et al.
- Fighting Fire With Fire: Can Chatgpt Detect Ai-generated Text? Amrita Bhattacharjee, Huan Liu
- On Generative Agents In Recommendation An Zhang, Yuxin Chen, Leheng Sheng, Xiang Wang, Tat-seng Chua
- Self-refine: Iterative Refinement With Self-feedback Aman Madaan et al.
- Poisoning Language Models During Instruction Tuning Alexander Wan, Eric Wallace, Sheng Shen, Dan Klein
- Can Chatgpt Forecast Stock Price Movements? Return Predictability And Large Language Models Alejandro Lopez-lira, Yuehua Tang
- Clipsyntel: CLIP And LLM Synergy For Multimodal Question Summarization In Healthcare Akash Ghosh et al.
- Self-rag: Learning To Retrieve, Generate, And Critique Through Self-reflection Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, Hannaneh Hajishirzi
- Mistral 7B Albert Q. Jiang et al.
- Enhancing Retrieval-augmented Large Language Models With Iterative Retrieval-generation Synergy Zhihong Shao et al.
- Translating Natural Language To Planning Goals With Large-language Models Yaqi Xie et al.
- Collaborative Large Language Model For Recommender Systems Yaochen Zhu, Liang Wu, Qi Guo, Liangjie Hong, Jundong Li
- Chatpose: Chatting About 3D Human Pose Yao Feng et al.
- On Learning To Summarize With Large Language Models As References Yixin Liu et al.
- Enhancing Job Recommendation Through Llm-based Generative Adversarial Networks Yingpeng Du et al.
- Can Chatgpt Reproduce Human-generated Labels? A Study Of Social Computing Tasks Yiming Zhu, Peixian Zhang, Ehsan-ul Haq, Pan Hui, Gareth Tyson
- Graph Neural Prompting With Large Language Models Yijun Tian et al.
- A Comparative Study Of Pretrained Language Models For Long Clinical Text Yikuan Li, Ramsey M. Wehbe, Faraz S. Ahmad, Hanyin Wang, Yuan Luo
- Human-centric Autonomous Systems With Llms For User Command Reasoning Yi Yang et al.
- Llm-eval: Unified Multi-dimensional Automatic Evaluation For Open-domain Conversations With Large Language Models Yen-ting Lin, Yun-nung Chen
- INSTRUCTEVAL: Towards Holistic Evaluation Of Instruction-tuned Large Language Models Yew Ken Chia, Pengfei Hong, Lidong Bing, Soujanya Poria
- A Multitask, Multilingual, Multimodal Evaluation Of Chatgpt On Reasoning, Hallucination, And Interactivity Yejin Bang et al.
- Mindmap: Knowledge Graph Prompting Sparks Graph Of Thoughts In Large Language Models Yilin Wen, Zifeng Wang, Jimeng Sun
- Hugginggpt: Solving AI Tasks With Chatgpt And Its Friends In Hugging Face Yongliang Shen et al.
- Gpt4aigchip: Towards Next-generation AI Accelerator Design Automation Via Large Language Models Yonggan Fu et al.
- How Far Can Camels Go? Exploring The State Of Instruction Tuning On Open Resources Yizhong Wang et al.
- Recmind: Large Language Model Powered Agent For Recommendation Yancheng Wang et al.
- Specinfer: Accelerating Generative Large Language Model Serving With Tree-based Speculative Inference And Verification Xupeng Miao et al.
- Mental-llm: Leveraging Large Language Models For Mental Health Prediction Via Online Text Data Xuhai Xu et al.
- Emotional Intelligence Of Large Language Models Xuena Wang, Xueting Li, Zi Yin, Yue Wu, Liu Jia
- Integrating Action Knowledge And Llms For Task Planning And Situation Handling In Open Worlds Yan Ding et al.
- Can Chatgpt Pass The Vietnamese National High School Graduation Examination? Xuan-quy Dao, Ngoc-bich Le, Xuan-dung Phan, Bac-bien Ngo
- Ghost In The Minecraft: Generally Capable Agents For Open-world Environments Via Large Language Models With Text-based Knowledge And Memory Xizhou Zhu et al.
- Teaching Large Language Models To Self-debug Xinyun Chen, Maxwell Lin, Nathanael Schärli, Denny Zhou
- "Do Anything Now": Characterizing And Evaluating In-the-wild Jailbreak Prompts On Large Language Models Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, Yang Zhang
- How Robust Is GPT-3.5 To Predecessors? A Comprehensive Study On Language Understanding Tasks Xuanting Chen et al.
- Mitigating Large Language Model Hallucinations Via Autonomous Knowledge Graph-based Retrofitting Xinyan Guan et al.
- Wavcaps: A Chatgpt-assisted Weakly-labelled Audio Captioning Dataset For Audio-language Multimodal Research Xinhao Mei et al.
- Vision Language Models In Autonomous Driving: A Survey And Outlook Xingcheng Zhou et al.
- Query Rewriting For Retrieval-augmented Large Language Models Xinbei Ma, Yeyun Gong, Pengcheng He, Hai Zhao, Nan Duan
- Delving Into Multimodal Prompting For Fine-grained Visual Classification Xin Jiang et al.
- PMC-VQA: Visual Instruction Tuning For Medical Visual Question Answering Xiaoman Zhang et al.
- Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! Xiangyu Qi et al.
- Medagents: Large Language Models As Collaborators For Zero-shot Medical Reasoning Xiangru Tang et al.
- Chacha: Leveraging Large Language Models To Prompt Children To Share Their Emotions About Personal Events Woosuk Seo, Chanmo Yang, Young-ho Kim
- Universalner: Targeted Distillation From Large Language Models For Open Named Entity Recognition Wenxuan Zhou, Sheng Zhang, Yu Gu, Muhao Chen, Hoifung Poon
- Generative Recommendation: Towards Next-generation Recommender Paradigm Wenjie Wang, Xinyu Lin, Fuli Feng, Xiangnan He, Tat-seng Chua
- Guiding Pretraining In Reinforcement Learning With Large Language Models Yuqing Du et al.
- Longbench: A Bilingual, Multitask Benchmark For Long Context Understanding Yushi Bai et al.
- Lampilot: An Open Benchmark Dataset For Autonomous Driving With Language Model Programs Yunsheng Ma et al.
- Large Language Models Are Versatile Decomposers: Decompose Evidence And Questions For Table-based Reasoning Yunhu Ye et al.
- Cachegen: KV Cache Compression And Streaming For Fast Large Language Model Serving Yuhan Liu et al.
- Large Language Model As Attributed Training Data Generator: A Tale Of Diversity And Bias Yue Yu et al.
- Aligning Large Language Models With Human: A Survey Yufei Wang et al.
- Textbooks Are All You Need II: Phi-1.5 Technical Report Yuanzhi Li et al.
- Preventing Zero-shot Transfer Degradation In Continual Learning Of Vision-language Models Zangwei Zheng et al.
- C-eval: A Multi-level Multi-discipline Chinese Evaluation Suite For Foundation Models Yuzhen Huang et al.
- Monitoring Ai-modified Content At Scale: A Case Study On The Impact Of Chatgpt On AI Conference Peer Reviews Weixin Liang et al.
- Chatbot Arena: An Open Platform For Evaluating Llms By Human Preference Wei-lin Chiang et al.
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- Continual Learning For Large Language Models: A Survey Tongtong Wu et al.
- Contextual AI Journaling: Integrating LLM And Time Series Behavioral Sensing Technology To Promote Self-reflection And Well-being Using The Mindscape App Subigya Nepal et al.
- Chatgpt As Research Scientist: Probing GPT's Capabilities As A Research Librarian, Research Ethicist, Data Generator And Data Predictor Steven A. Lehr, Aylin Caliskan, Suneragiri Liyanage, Mahzarin R. Banaji
- Large Language Models Meet Collaborative Filtering: An Efficient All-round Llm-based Recommender System Sein Kim et al.
- An Empirical Study On Usage And Perceptions Of Llms In A Software Engineering Project Sanka Rasnayaka, Guanlin Wang, Ridwan Shariffdeen, Ganesh Neelakanta Iyer
- A Comprehensive Survey Of Hallucination Mitigation Techniques In Large Language Models S. M Towhidul Islam Tonmoy et al.
- Me Llama: Foundation Large Language Models For Medical Applications Qianqian Xie et al.
- A Systematic Survey Of Prompt Engineering In Large Language Models: Techniques And Applications Pranab Sahoo et al.
- SNIFFER: Multimodal Large Language Model For Explainable Out-of-context Misinformation Detection Peng Qi, Zehong Yan, Wynne Hsu, Mong Li Lee
- Shaping Human-ai Collaboration: Varied Scaffolding Levels In Co-writing With Language Models Paramveer S. Dhillon et al.
- Jamba: A Hybrid Transformer-mamba Language Model Opher Lieber et al.
- CBR-RAG: Case-based Reasoning For Retrieval Augmented Generation In Llms For Legal Question Answering Nirmalie Wiratunga et al.
- When Large Language Model Agents Meet 6G Networks: Perception, Grounding, And Alignment Minrui Xu et al.
- Findings Of The Second Babylm Challenge: Sample-efficient Pretraining On Developmentally Plausible Corpora Michael Y. Hu et al.
- Xlstm: Extended Long Short-term Memory Maximilian Beck et al.
- Codeaid: Evaluating A Classroom Deployment Of An Llm-based Programming Assistant That Balances Student And Educator Needs Majeed Kazemitabaar et al.
- Capabilities Of Gemini Models In Medicine Khaled Saab et al.
- Pixart-Σ: Weak-to-strong Training Of Diffusion Transformer For 4K Text-to-image Generation Junsong Chen et al.
- Openmedlm: Prompt Engineering Can Out-perform Fine-tuning In Medical Question-answering With Open-source Large Language Models Jenish Maharjan et al.
- Revolutionizing Finance With Llms: An Overview Of Applications And Insights Huaqin Zhao et al.
- Fine Tuning Vs. Retrieval Augmented Generation For Less Popular Knowledge Heydar Soudani, Evangelos Kanoulas, Faegheh Hasibi
- Benchmarking Retrieval-augmented Generation For Medicine Guangzhi Xiong, Qiao Jin, Zhiyong Lu, Aidong Zhang
- Materials Science In The Era Of Large Language Models: A Perspective Ge Lei, Ronan Docherty, Samuel J. Cooper
- Code-aware Prompting: A Study Of Coverage Guided Test Generation In Regression Setting Using LLM Gabriel Ryan et al.
- The Power Of Noise: Redefining Retrieval For RAG Systems Florin Cuconasu et al.
- Moe-llava: Mixture Of Experts For Large Vision-language Models Bin Lin et al.
- Large Language Models And User Trust: Consequence Of Self-referential Learning Loop And The Deskilling Of Healthcare Professionals Avishek Choudhury, Zaria Chaudhry
- RAG Vs Fine-tuning: Pipelines, Tradeoffs, And A Case Study On Agriculture Angels Balaguer et al.
- Optimization Methods For Personalizing Large Language Models Through Retrieval Augmentation Alireza Salemi, Surya Kallumadi, Hamed Zamani
- Does Fine-tuning Llms On New Knowledge Encourage Hallucinations? Zorik Gekhman et al.
- Llmparser: An Exploratory Study On Using Large Language Models For Log Parsing Zeyang Ma, An Ran Chen, Dong Jae Kim, Tse-hsun Chen, Shaowei Wang
- Let Me Do It For You: Towards LLM Empowered Recommendation Via Tool Learning Yuyue Zhao et al.
- Autocoderover: Autonomous Program Improvement Yuntong Zhang, Haifeng Ruan, Zhiyu Fan, Abhik Roychoudhury
- Survey On Large Language Model-enhanced Reinforcement Learning: Concept, Taxonomy, And Methods Yuji Cao et al.
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
- Data-efficient Fine-tuning For Llm-based Recommendation Xinyu Lin et al.
- Searching For Best Practices In Retrieval-augmented Generation Xiaohua Wang et al.
- A Survey On RAG Meeting Llms: Towards Retrieval-augmented Large Language Models Wenqi Fan et al.
- CRUD-RAG: A Comprehensive Chinese Benchmark For Retrieval-augmented Generation Of Large Language Models Yuanjie Lyu et al.
- Farsight: Fostering Responsible AI Awareness During AI Application Prototyping Zijie J. Wang, Chinmay Kulkarni, Lauren Wilcox, Michael Terry, Michael Madaio
- Prompting Large Language Models With Rationale Heuristics For Knowledge-based Visual Question Answering Zhongjian Hu, Peng Yang, Bing Li, Fengyuan Liu
🏷 RecSys
- Generate Natural Language Explanations For Recommendation Hanxiong Chen, Xu Chen, Shaoyun Shi, Yongfeng Zhang
- Training Large-scale News Recommenders With Pretrained Language Models In The Loop Shitao Xiao, Zheng Liu, Yingxia Shao, Tao Di, Xing Xie
- Non-invasive Self-attention For Side Information Fusion In Sequential Recommendation Chang Liu et al.
- Towards Retrieval-based Conversational Recommendation Ahtsham Manzoor, Dietmar Jannach
- Recommendation As Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5) Shijie Geng, Shuchang Liu, Zuohui Fu, Yingqiang Ge, Yongfeng Zhang
- Personalized Prompt Learning For Explainable Recommendation Lei Li, Yongfeng Zhang, Li Chen
- Towards Unified Conversational Recommender Systems Via Knowledge-enhanced Prompt Learning Xiaolei Wang, Kun Zhou, Ji-rong Wen, Wayne Xin Zhao
- Learning Vector-quantized Item Representation For Transferable Sequential Recommenders Yupeng Hou, Zhankui He, Julian Mcauley, Wayne Xin Zhao
- A Unified Multi-task Learning Framework For Multi-goal Conversational Recommender Systems Yang Deng et al.
- Meta Policy Learning For Cold-start Conversational Recommendation Zhendong Chu, Hongning Wang, Yun Xiao, Bo Long, Lingfei Wu
- M6-rec: Generative Pretrained Language Models Are Open-ended Recommender Systems Zeyu Cui, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang
- Agentcf: Collaborative Learning With Autonomous Language Agents For Recommender Systems Junjie Zhang et al.
- Recommendation As Instruction Following: A Large Language Model Empowered Recommendation Approach Junjie Zhang et al.
- How Can Recommender Systems Benefit From Large Language Models: A Survey Jianghao Lin et al.
- Rella: Retrieval-enhanced Large Language Models For Lifelong Sequential Behavior Comprehension In Recommendation Jianghao Lin et al.
- VIP5: Towards Multimodal Foundation Models For Recommendation Shijie Geng, Juntao Tan, Shuchang Liu, Zuohui Fu, Yongfeng Zhang
- Recommender Systems With Generative Retrieval Shashank Rajput et al.
- Large Language Models Are Competitive Near Cold-start Recommenders For Language- And Item-based Preferences Scott Sanner, Krisztian Balog, Filip Radlinski, Ben Wedin, Lucas Dixon
- ONCE: Boosting Content-based Recommendation With Both Open- And Closed-source Large Language Models Qijiong Liu, Nuo Chen, Tetsuya Sakai, Xiao-ming Wu
- Pre-train, Prompt And Recommendation: A Comprehensive Survey Of Language Modelling Paradigm Adaptations In Recommender Systems Peng Liu, Lemei Zhang, Jon Atle Gulla
- Uncovering Chatgpt's Capabilities In Recommender Systems Sunhao Dai et al.
- Llmrec: Large Language Models With Graph Augmentation For Recommendation Wei Wei et al.
- Adapting Large Language Models By Integrating Collaborative Semantics For Recommendation Bowen Zheng et al.
- On Generative Agents In Recommendation An Zhang, Yuxin Chen, Leheng Sheng, Xiang Wang, Tat-seng Chua
- Collaborative Large Language Model For Recommender Systems Yaochen Zhu, Liang Wu, Qi Guo, Liangjie Hong, Jundong Li
- Recommender Systems In The Era Of Large Language Models (llms) Zihuai Zhao et al.
- Rethinking The Evaluation For Conversational Recommendation In The Era Of Large Language Models Xiaolei Wang, Xinyu Tang, Wayne Xin Zhao, Jingyuan Wang, Ji-rong Wen
- Representation Learning With Large Language Models For Recommendation Xubin Ren et al.
- Generative Recommendation: Towards Next-generation Recommender Paradigm Wenjie Wang, Xinyu Lin, Fuli Feng, Xiangnan He, Tat-seng Chua
- Large Language Models Are Zero-shot Rankers For Recommender Systems Yupeng Hou et al.
- Towards Open-world Recommendation With Knowledge Augmentation From Large Language Models Yunjia Xi et al.
- Chat-rec: Towards Interactive And Explainable Llms-augmented Recommender System Yunfan Gao et al.
- Large Language Models Meet Collaborative Filtering: An Efficient All-round Llm-based Recommender System Sein Kim et al.
- Linrec: Linear Attention Mechanism For Long-term Sequential Recommender Systems Langming Liu et al.
- Harnessing Large Language Models For Text-rich Sequential Recommendation Zhi Zheng, Wenshuo Chao, Zhaopeng Qiu, Hengshu Zhu, Hui Xiong
- Let Me Do It For You: Towards LLM Empowered Recommendation Via Tool Learning Yuyue Zhao et al.
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
- A Review Of Modern Recommender Systems Using Generative Models (gen-recsys) Yashar Deldjoo et al.
🏷 Reinforcement Learning
- Deep Active Learning For Dialogue Generation Nabiha Asghar, Pascal Poupart, Xin Jiang, Hang Li
- An Actor-critic Algorithm For Sequence Prediction Dzmitry Bahdanau et al.
- Neural Text Generation From Structured Data With Application To The Biography Domain Remi Lebret, David Grangier, Michael Auli
- A User Simulator For Task-completion Dialogues Xiujun Li et al.
- Separating Answers From Queries For Neural Reading Comprehension Dirk Weissenborn
- Deep Reinforcement Learning For Dialogue Generation Jiwei Li et al.
- A Simple, Fast Diverse Decoding Algorithm For Neural Generation Jiwei Li, Will Monroe, Dan Jurafsky
- Generative Deep Neural Networks For Dialogue: A Short Review Iulian Vlad Serban, Ryan Lowe, Laurent Charlin, Joelle Pineau
- Learning Python Code Suggestion With A Sparse Pointer Network Avishkar Bhoopchand, Tim Rocktäschel, Earl Barr, Sebastian Riedel
- Sample-efficient Actor-critic Reinforcement Learning With Supervised Data For Dialogue Management Pei-hao Su, Pawel Budzianowski, Stefan Ultes, Milica Gasic, Steve Young
- Mojitalk: Generating Emotional Responses At Scale Xianda Zhou, William Yang Wang
- Long Text Generation Via Adversarial Training With Leaked Information Jiaxian Guo et al.
- Adversarial Learning For Neural Dialogue Generation Jiwei Li et al.
- Ask The Right Questions: Active Question Reformulation With Reinforcement Learning Christian Buck et al.
- Searchqa: A New Q&A Dataset Augmented With Context From A Search Engine Matthew Dunn et al.
- Data Distillation For Controlling Specificity In Dialogue Generation Jiwei Li, Will Monroe, Dan Jurafsky
- Latent Intention Dialogue Models Tsung-hsien Wen, Yishu Miao, Phil Blunsom, Steve Young
- R^3: Reinforced Reader-ranker For Open-domain Question Answering Shuohang Wang et al.
- End-to-end Optimization Of Goal-driven And Visually Grounded Dialogue Systems Florian Strub et al.
- Batch Policy Gradient Methods For Improving Neural Conversation Models Kirthevasan Kandasamy, Yoram Bachrach, Ryota Tomioka, Daniel Tarlow, David Carter
- Neural Response Generation With Dynamic Vocabularies Yu Wu et al.
- Parlai: A Dialog Research Software Platform Alexander H. Miller et al.
- A Deep Reinforcement Learning Chatbot Iulian V. Serban et al.
- Fine Grained Knowledge Transfer For Personalized Task-oriented Dialogue Systems Kaixiang Mo, Yu Zhang, Qiang Yang, Pascale Fung
- DP-GAN: Diversity-promoting Generative Adversarial Network For Generating Informative And Diversified Text Jingjing Xu, Xuancheng Ren, Junyang Lin, Xu Sun
- Maskgan: Better Text Generation Via Filling In The______ William Fedus, Ian Goodfellow, Andrew M. Dai
- Dialogue Generation: From Imitation Learning To Inverse Reinforcement Learning Ziming Li, Julia Kiseleva, Maarten De Rijke
- Fast Abstractive Summarization With Reinforce-selected Sentence Rewriting Yen-chun Chen, Mohit Bansal
- The Memad Submission To The WMT18 Multimodal Translation Task Stig-arne Grönroos et al.
- Efficient Contextualized Representation: Language Model Pruning For Sequence Labeling Liyuan Liu, Xiang Ren, Jingbo Shang, Jian Peng, Jiawei Han
- Seq2seq-vis: A Visual Debugging Tool For Sequence-to-sequence Models Hendrik Strobelt et al.
- Can You Tell Me How To Get Past Sesame Street? Sentence-level Pretraining Beyond Language Modeling Alex Wang et al.
- Towards Explainable And Controllable Open Domain Dialogue Generation With Dialogue Acts Can Xu, Wei Wu, Yu Wu
- Babyai: A Platform To Study The Sample Efficiency Of Grounded Language Learning Maxime Chevalier-boisvert et al.
- Sentence Encoders On Stilts: Supplementary Training On Intermediate Labeled-data Tasks Jason Phang, Thibault Févry, Samuel R. Bowman
- Toward Diverse Text Generation With Inverse Reinforcement Learning Zhan Shi, Xinchi Chen, Xipeng Qiu, Xuanjing Huang
- Hybrid Retrieval-generation Reinforced Agent For Medical Image Report Generation Christy Y. Li, Xiaodan Liang, Zhiting Hu, Eric P. Xing
- Controllable Neural Story Plot Generation Via Reward Shaping Pradyumna Tambwekar et al.
- Disentangling Language And Knowledge In Task-oriented Dialogs Dinesh Raghu, Nikhil Gupta, Mausam
- Towards Exploiting Background Knowledge For Building Conversation Systems Nikita Moghe, Siddhartha Arora, Suman Banerjee, Mitesh M. Khapra
- Complex Sequential Question Answering: Towards Learning To Converse Over Linked Question Answer Pairs With A Knowledge Graph Amrita Saha, Vardaan Pahuja, Mitesh M. Khapra, Karthik Sankaranarayanan, Sarath Chandar
- On Evaluating And Comparing Open Domain Dialog Systems Anu Venkatesh et al.
- Polite Dialogue Generation Without Parallel Data Tong Niu, Mohit Bansal
- Conversational AI: The Science Behind The Alexa Prize Ashwin Ram et al.
- Guiding Policies With Language Via Meta-learning John D. Co-reyes et al.
- Is Multilingual BERT Fluent In Language Generation? Samuel Rönnqvist, Jenna Kanerva, Tapio Salakoski, Filip Ginter
- Ensemble-based Deep Reinforcement Learning For Chatbots Heriberto Cuayáhuitl et al.
- Sample Efficient Text Summarization Using A Single Pre-trained Transformer Urvashi Khandelwal, Kevin Clark, Dan Jurafsky, Lukasz Kaiser
- MKD: A Multi-task Knowledge Distillation Approach For Pretrained Language Models Linqing Liu, Huan Wang, Jimmy Lin, Richard Socher, Caiming Xiong
- BART: Denoising Sequence-to-sequence Pre-training For Natural Language Generation, Translation, And Comprehension Mike Lewis et al.
- Multimodal Attention Networks For Low-level Vision-and-language Navigation Federico Landi, Lorenzo Baraldi, Marcella Cornia, Massimiliano Corsini, Rita Cucchiara
- Conversing By Reading: Contentful Neural Conversation With On-demand Machine Reading Lianhui Qin et al.
- Probing Natural Language Inference Models Through Semantic Fragments Kyle Richardson, Hai Hu, Lawrence S. Moss, Ashish Sabharwal
- Entity-consistent End-to-end Task-oriented Dialogue System With KB Retriever Libo Qin et al.
- Revealing The Dark Secrets Of BERT Olga Kovaleva, Alexey Romanov, Anna Rogers, Anna Rumshisky
- Nemo: A Toolkit For Building AI Applications Using Neural Modules Oleksii Kuchaiev et al.
- Counterfactual Story Reasoning And Generation Lianhui Qin et al.
- Generating Persona Consistent Dialogues By Exploiting Natural Language Inference Haoyu Song, Wei-nan Zhang, Jingwen Hu, Ting Liu
- Convert: Efficient And Accurate Conversational Representations From Transformers Matthew Henderson et al.
- ELI5: Long Form Question Answering Angela Fan et al.
- Answering Complex Open-domain Questions Through Iterative Query Generation Peng Qi, Xiaowen Lin, Leo Mehr, Zijian Wang, Christopher D. Manning
- Q8BERT: Quantized 8bit BERT Ofir Zafrir, Guy Boudoukh, Peter Izsak, Moshe Wasserblat
- Improving Transformer Models By Reordering Their Sublayers Ofir Press, Noah A. Smith, Omer Levy
- Say What I Want: Towards The Dark Side Of Neural Dialogue Models Haochen Liu, Tyler Derr, Zitao Liu, Jiliang Tang
- Dykgchat: Benchmarking Dialogue Generation Grounding On Dynamic Knowledge Graphs Yi-lin Tuan, Yun-nung Chen, Hung-yi Lee
- PLATO: Pre-trained Dialogue Generation Model With Discrete Latent Variable Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang
- Sticking To The Facts: Confident Decoding For Faithful Data-to-text Generation Ran Tian, Shashi Narayan, Thibault Sellam, Ankur P. Parikh
- Multifit: Efficient Multi-lingual Language Model Fine-tuning Julian Martin Eisenschlos et al.
- Dialogue Transformers Vladimir Vlasov, Johannes E. M. Mosig, Alan Nichol
- Compressive Transformers For Long-range Sequence Modelling Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, Timothy P. Lillicrap
- Well-read Students Learn Better: On The Importance Of Pre-training Compact Models Iulia Turc, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Generalization In Generation: A Closer Look At Exposure Bias Florian Schmidt
- Unsupervised Cross-lingual Representation Learning At Scale Alexis Conneau et al.
- Pay Less Attention With Lightweight And Dynamic Convolutions Felix Wu, Angela Fan, Alexei Baevski, Yann N. Dauphin, Michael Auli
- Empdg: Multiresolution Interactive Empathetic Dialogue Generation Qintong Li et al.
- Structured Pruning Of A Bert-based Question Answering Model J. S. Mccarley, Rishav Chakravarti, Avirup Sil
- Zero: Memory Optimizations Toward Training Trillion Parameter Models Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, Yuxiong He
- Do Neural Language Representations Learn Physical Commonsense? Maxwell Forbes, Ari Holtzman, Yejin Choi
- How Does BERT Answer Questions? A Layer-wise Analysis Of Transformer Representations Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- Rankqa: Neural Question Answering With Answer Re-ranking Bernhard Kratzwald, Anna Eigenmann, Stefan Feuerriegel
- End-to-end Bias Mitigation By Modelling Biases In Corpora Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- Story Ending Prediction By Transferable BERT Zhongyang Li, Xiao Ding, Ting Liu
- Codegru: Context-aware Deep Learning With Gated Recurrent Unit For Source Code Modeling Yasir Hussain, Zhiqiu Huang, Yu Zhou, Senzhang Wang
- MASS: Masked Sequence To Sequence Pre-training For Language Generation Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-yan Liu
- Reinforced Dynamic Reasoning For Conversational Question Generation Boyuan Pan, Hao Li, Ziyu Yao, Deng Cai, Huan Sun
- Insertion-based Decoding With Automatically Inferred Generation Order Jiatao Gu, Qi Liu, Kyunghyun Cho
- QASC: A Dataset For Question Answering Via Sentence Composition Tushar Khot, Peter Clark, Michal Guerquin, Peter Jansen, Ashish Sabharwal
- Multi-hop Question Answering Via Reasoning Chains Jifan Chen, Shih-ting Lin, Greg Durrett
- CTRL: A Conditional Transformer Language Model For Controllable Generation Nitish Shirish Keskar, Bryan Mccann, Lav R. Varshney, Caiming Xiong, Richard Socher
- Patent Claim Generation By Fine-tuning Openai GPT-2 Jieh-sheng Lee, Jieh Hsiang
- Stabilizing Transformers For Reinforcement Learning Emilio Parisotto et al.
- Abductive Commonsense Reasoning Chandra Bhagavatula et al.
- 12-in-1: Multi-task Vision And Language Representation Learning Jiasen Lu, Vedanuj Goswami, Marcus Rohrbach, Devi Parikh, Stefan Lee
- Reinforcement Learning Based Emotional Editing Constraint Conversation Generation Jia Li, Xiao Sun, Xing Wei, Changliang Li, Jianhua Tao
- Freelb: Enhanced Adversarial Training For Natural Language Understanding Chen Zhu et al.
- Fine-tuning Language Models From Human Preferences Daniel M. Ziegler et al.
- Evaluating Commonsense In Pre-trained Language Models Xuhui Zhou, Yue Zhang, Leyang Cui, Dandan Huang
- Unsupervised Question Answering By Cloze Translation Patrick Lewis, Ludovic Denoyer, Sebastian Riedel
- Hello, It's GPT-2 -- How Can I Help You? Towards The Use Of Pretrained Language Models For Task-oriented Dialogue Systems Paweł Budzianowski, Ivan Vulić
- A Survey Of Natural Language Generation Techniques With A Focus On Dialogue Systems - Past, Present And Future Directions Sashank Santhanam, Samira Shaikh
- Span Selection Pre-training For Question Answering Michael Glass et al.
- Roberta: A Robustly Optimized BERT Pretraining Approach Yinhan Liu et al.
- Generating Empathetic Responses By Looking Ahead The User's Sentiment Jamin Shin, Peng Xu, Andrea Madotto, Pascale Fung
- GLTR: Statistical Detection And Visualization Of Generated Text Sebastian Gehrmann, Hendrik Strobelt, Alexander M. Rush
- Juice: A Large Scale Distantly Supervised Dataset For Open Domain Context-based Code Generation Rajas Agashe, Srinivasan Iyer, Luke Zettlemoyer
- Using Natural Language For Reward Shaping In Reinforcement Learning Prasoon Goyal, Scott Niekum, Raymond J. Mooney
- Explain Yourself! Leveraging Language Models For Commonsense Reasoning Nazneen Fatema Rajani, Bryan Mccann, Caiming Xiong, Richard Socher
- Lakhnes: Improving Multi-instrumental Music Generation With Cross-domain Pre-training Chris Donahue, Huanru Henry Mao, Yiting Ethan Li, Garrison W. Cottrell, Julian Mcauley
- Fusion Of Detected Objects In Text For Visual Question Answering Chris Alberti, Jeffrey Ling, Michael Collins, David Reitter
- Latent Retrieval For Weakly Supervised Open Domain Question Answering Kenton Lee, Ming-wei Chang, Kristina Toutanova
- Countering Language Drift Via Visual Grounding Jason Lee, Kyunghyun Cho, Douwe Kiela
- Syntax-infused Transformer And BERT Models For Machine Translation And Natural Language Understanding Dhanasekar Sundararaman et al.
- Pretrained Encyclopedia: Weakly Supervised Knowledge-pretrained Language Model Wenhan Xiong, Jingfei Du, William Yang Wang, Veselin Stoyanov
- Few-shot NLG With Pre-trained Language Model Zhiyu Chen, Harini Eavani, Wenhu Chen, Yinyin Liu, William Yang Wang
- Fairseq: A Fast, Extensible Toolkit For Sequence Modeling Myle Ott et al.
- Incremental Transformer With Deliberation Decoder For Document Grounded Conversations Zekang Li et al.
- Learning To Answer By Learning To Ask: Getting The Best Of GPT-2 And BERT Worlds Tassilo Klein, Moin Nabi
- Attention-informed Mixed-language Training For Zero-shot Cross-lingual Task-oriented Dialogue Systems Zihan Liu, Genta Indra Winata, Zhaojiang Lin, Peng Xu, Pascale Fung
- Encoder-agnostic Adaptation For Conditional Language Generation Zachary M. Ziegler, Luke Melas-kyriazi, Sebastian Gehrmann, Alexander M. Rush
- Visualizing Attention In Transformer-based Language Representation Models Jesse Vig
- Controlled Hallucinations: Learning To Generate Faithfully From Noisy Data Katja Filippova
- Modelling Hierarchical Structure Between Dialogue Policy And Natural Language Generator With Option Framework For Task-oriented Dialogue System Jianhong Wang, Yuan Zhang, Tae-kyun Kim, Yunjie Gu
- Measuring Systematic Generalization In Neural Proof Generation With Transformers Nicolas Gontier, Koustuv Sinha, Siva Reddy, Christopher Pal
- SPARTA: Efficient Open-domain Question Answering Via Sparse Transformer Matching Retrieval Tiancheng Zhao, Xiaopeng Lu, Kyusong Lee
- Train Large, Then Compress: Rethinking Model Size For Efficient Training And Inference Of Transformers Zhuohan Li et al.
- REALM: Retrieval-augmented Language Model Pre-training Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-wei Chang
- Improved Natural Language Generation Via Loss Truncation Daniel Kang, Tatsunori Hashimoto
- Fine-tuning Pretrained Language Models: Weight Initializations, Data Orders, And Early Stopping Jesse Dodge et al.
- When BERT Plays The Lottery, All Tickets Are Winning Sai Prasanna, Anna Rogers, Anna Rumshisky
- Reducing Gender Bias In Neural Machine Translation As A Domain Adaptation Problem Danielle Saunders, Bill Byrne
- The Chess Transformer: Mastering Play Using Generative Language Models David Noever, Matt Ciolino, Josh Kalin
- Intermediate-task Transfer Learning With Pretrained Models For Natural Language Understanding: When And Why Does It Work? Yada Pruksachatkun et al.
- KVL-BERT: Knowledge Enhanced Visual-and-linguistic BERT For Visual Commonsense Reasoning Dandan Song, Siyi Ma, Zhanchen Sun, Sicheng Yang, Lejian Liao
- CG-BERT: Conditional Text Generation With BERT For Generalized Few-shot Intent Detection Congying Xia, Chenwei Zhang, Hoang Nguyen, Jiawei Zhang, Philip Yu
- Pymt5: Multi-mode Translation Of Natural Language And Python Code With Transformers Colin B. Clement, Dawn Drain, Jonathan Timcheck, Alexey Svyatkovskiy, Neel Sundaresan
- Deebert: Dynamic Early Exiting For Accelerating BERT Inference Ji Xin, Raphael Tang, Jaejun Lee, Yaoliang Yu, Jimmy Lin
- If Beam Search Is The Answer, What Was The Question? Clara Meister, Tim Vieira, Ryan Cotterell
- Detecting Hallucinated Content In Conditional Neural Sequence Generation Chunting Zhou et al.
- Masking As An Efficient Alternative To Finetuning For Pretrained Language Models Mengjie Zhao, Tao Lin, Fei Mi, Martin Jaggi, Hinrich Schütze
- Meaningful Answer Generation Of E-commerce Question-answering Shen Gao, Xiuying Chen, Zhaochun Ren, Dongyan Zhao, Rui Yan
- TOD-BERT: Pre-trained Natural Language Understanding For Task-oriented Dialogue Chien-sheng Wu, Steven Hoi, Richard Socher, Caiming Xiong
- A Knowledge-enhanced Pretraining Model For Commonsense Story Generation Jian Guan, Fei Huang, Zhihao Zhao, Xiaoyan Zhu, Minlie Huang
- VD-BERT: A Unified Vision And Dialog Transformer With BERT Yue Wang et al.
- KG-BART: Knowledge Graph-augmented BART For Generative Commonsense Reasoning Ye Liu, Yao Wan, Lifang He, Hao Peng, Philip S. Yu
- Better Robustness By More Coverage: Adversarial Training With Mixup Augmentation For Robust Fine-tuning Chenglei Si et al.
- Explaining Question Answering Models Through Text Generation Veronica Latcinnik, Jonathan Berant
- BANG: Bridging Autoregressive And Non-autoregressive Generation With Large Scale Pretraining Weizhen Qi et al.
- Learning To Recombine And Resample Data For Compositional Generalization Ekin Akyürek, Afra Feyza Akyürek, Jacob Andreas
- Alfworld: Aligning Text And Embodied Environments For Interactive Learning Mohit Shridhar et al.
- Inducing Language-agnostic Multilingual Representations Wei Zhao, Steffen Eger, Johannes Bjerva, Isabelle Augenstein
- Gshard: Scaling Giant Models With Conditional Computation And Automatic Sharding Dmitry Lepikhin et al.
- Logic2text: High-fidelity Natural Language Generation From Logical Forms Zhiyu Chen et al.
- Codebert: A Pre-trained Model For Programming And Natural Languages Zhangyin Feng et al.
- Visbert: Hidden-state Visualizations For Transformers Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- When Being Unseen From Mbert Is Just The Beginning: Handling New Languages With Multilingual Language Models Benjamin Muller, Antonis Anastasopoulos, Benoît Sagot, Djamé Seddah
- Tabert: Pretraining For Joint Understanding Of Textual And Tabular Data Pengcheng Yin, Graham Neubig, Wen-tau Yih, Sebastian Riedel
- Unqovering Stereotyping Biases Via Underspecified Questions Tao Li, Tushar Khot, Daniel Khashabi, Ashish Sabharwal, Vivek Srikumar
- PONE: A Novel Automatic Evaluation Metric For Open-domain Generative Dialogue Systems Tian Lan, Xian-ling Mao, Wei Wei, Xiaoyan Gao, Heyan Huang
- SOLOIST: Building Task Bots At Scale With Transfer Learning And Machine Teaching Baolin Peng et al.
- DUMA: Reading Comprehension With Transposition Thinking Pengfei Zhu, Hai Zhao, Xiaoguang Li
- Robust Conversational AI With Grounded Text Generation Jianfeng Gao et al.
- The Turking Test: Can Language Models Understand Instructions? Avia Efrat, Omer Levy
- Neural Generation Meets Real People: Towards Emotionally Engaging Mixed-initiative Conversations Ashwin Paranjape et al.
- DIET: Lightweight Language Understanding For Dialogue Systems Tanja Bunk, Daksh Varshneya, Vladimir Vlasov, Alan Nichol
- MEGATRON-CNTRL: Controllable Story Generation With External Knowledge Using Large-scale Language Models Peng Xu et al.
- Addressing Some Limitations Of Transformers With Feedback Memory Angela Fan, Thibaut Lavril, Edouard Grave, Armand Joulin, Sainbayar Sukhbaatar
- A Recurrent Vision-and-language BERT For Navigation Yicong Hong, Qi Wu, Yuankai Qi, Cristian Rodriguez-opazo, Stephen Gould
- ABNIRML: Analyzing The Behavior Of Neural IR Models Sean Macavaney, Sergey Feldman, Nazli Goharian, Doug Downey, Arman Cohan
- Proofwriter: Generating Implications, Proofs, And Abductive Statements Over Natural Language Oyvind Tafjord, Bhavana Dalvi Mishra, Peter Clark
- Leap-of-thought: Teaching Pre-trained Models To Systematically Reason Over Implicit Knowledge Alon Talmor, Oyvind Tafjord, Peter Clark, Yoav Goldberg, Jonathan Berant
- BLEURT: Learning Robust Metrics For Text Generation Thibault Sellam, Dipanjan Das, Ankur P. Parikh
- Edgebert: Sentence-level Energy Optimizations For Latency-aware Multi-task NLP Inference Thierry Tambe et al.
- Coregen: Contextualized Code Representation Learning For Commit Message Generation Lun Yiu Nie et al.
- Intellicode Compose: Code Generation Using Transformer Alexey Svyatkovskiy, Shao Kun Deng, Shengyu Fu, Neel Sundaresan
- Facts As Experts: Adaptable And Interpretable Neural Memory Over Symbolic Knowledge Pat Verga, Haitian Sun, Livio Baldini Soares, William W. Cohen
- Adapterhub: A Framework For Adapting Transformers Jonas Pfeiffer et al.
- Grounding Language To Autonomously-acquired Skills Via Goal Generation Ahmed Akakzia, Cédric Colas, Pierre-yves Oudeyer, Mohamed Chetouani, Olivier Sigaud
- Question And Answer Test-train Overlap In Open-domain Question Answering Datasets Patrick Lewis, Pontus Stenetorp, Sebastian Riedel
- Mixkd: Towards Efficient Distillation Of Large-scale Language Models Kevin J Liang et al.
- Retrieval-augmented Generation For Knowledge-intensive NLP Tasks Patrick Lewis et al.
- A Survey Of Knowledge-enhanced Text Generation Wenhao Yu et al.
- XTREME: A Massively Multilingual Multi-task Benchmark For Evaluating Cross-lingual Generalization Junjie Hu et al.
- Assessing Phrasal Representation And Composition In Transformers Lang Yu, Allyson Ettinger
- Earlybert: Efficient BERT Training Via Early-bird Lottery Tickets Xiaohan Chen et al.
- X-FACTR: Multilingual Factual Knowledge Retrieval From Pretrained Language Models Zhengbao Jiang, Antonios Anastasopoulos, Jun Araki, Haibo Ding, Graham Neubig
- Text Generation By Learning From Demonstrations Richard Yuanzhe Pang, He He
- Probing Pretrained Language Models For Lexical Semantics Ivan Vulić, Edoardo Maria Ponti, Robert Litschko, Goran Glavaš, Anna Korhonen
- Human Instruction-following With Deep Reinforcement Learning Via Transfer-learning From Text Felix Hill, Sona Mokra, Nathaniel Wong, Tim Harley
- Multi-modal Open-domain Dialogue Kurt Shuster, Eric Michael Smith, Da Ju, Jason Weston
- Trading Off Diversity And Quality In Natural Language Generation Hugh Zhang, Daniel Duckworth, Daphne Ippolito, Arvind Neelakantan
- The Language Interpretability Tool: Extensible, Interactive Visualizations And Analysis For NLP Models Ian Tenney et al.
- How Context Affects Language Models' Factual Predictions Fabio Petroni et al.
- Grounded Language Learning Fast And Slow Felix Hill et al.
- Template Guided Text Generation For Task-oriented Dialogue Mihir Kale, Abhinav Rastogi
- Multilingual Translation With Extensible Multilingual Pretraining And Finetuning Yuqing Tang et al.
- Mixup-transformer: Dynamic Data Augmentation For NLP Tasks Lichao Sun et al.
- Scientific Claim Verification With VERT5ERINI Ronak Pradeep, Xueguang Ma, Rodrigo Nogueira, Jimmy Lin
- Document Ranking With A Pretrained Sequence-to-sequence Model Rodrigo Nogueira, Zhiying Jiang, Jimmy Lin
- Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation Ruibo Liu et al.
- Language Generation With Multi-hop Reasoning On Commonsense Knowledge Graph Haozhe Ji et al.
- ELECTRA: Pre-training Text Encoders As Discriminators Rather Than Generators Kevin Clark, Minh-thang Luong, Quoc V. Le, Christopher D. Manning
- On Optimal Transformer Depth For Low-resource Language Translation Elan Van Biljon, Arnu Pretorius, Julia Kreutzer
- An Empirical Study On Robustness To Spurious Correlations Using Pre-trained Language Models Lifu Tu, Garima Lalwani, Spandana Gella, He He
- Logical Natural Language Generation From Open-domain Tables Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang
- Mitigating Gender Bias For Neural Dialogue Generation With Adversarial Learning Haochen Liu et al.
- BERT Loses Patience: Fast And Robust Inference With Early Exit Wangchunshu Zhou et al.
- Funnel-transformer: Filtering Out Sequential Redundancy For Efficient Language Processing Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le
- Longformer: The Long-document Transformer Iz Beltagy, Matthew E. Peters, Arman Cohan
- Vokenization: Improving Language Understanding With Contextualized, Visual-grounded Supervision Hao Tan, Mohit Bansal
- An Exploratory Study On Long Dialogue Summarization: What Works And What's Next Yusen Zhang et al.
- Generate Natural Language Explanations For Recommendation Hanxiong Chen, Xu Chen, Shaoyun Shi, Yongfeng Zhang
- On Transferability Of Prompt Tuning For Natural Language Processing Yusheng Su et al.
- The NLP Cookbook: Modern Recipes For Transformer Based Deep Learning Architectures Sushant Singh, Ausif Mahmood
- Evaluating The Robustness Of Retrieval Pipelines With Query Variation Generators Gustavo Penha, Arthur Câmara, Claudia Hauff
- Wenlan: Bridging Vision And Language By Large-scale Multi-modal Pre-training Yuqi Huo et al.
- P-tuning V2: Prompt Tuning Can Be Comparable To Fine-tuning Universally Across Scales And Tasks Xiao Liu et al.
- Learning How To Ask: Querying Lms With Mixtures Of Soft Prompts Guanghui Qin, Jason Eisner
- Cutting Down On Prompts And Parameters: Simple Few-shot Learning With Language Models Robert L. Iv Logan et al.
- Evaluating The Robustness Of Neural Language Models To Input Perturbations Milad Moradi, Matthias Samwald
- Emotion-aware Chat Machine: Automatic Emotional Response Generation For Human-like Emotional Interaction Wei Wei et al.
- Explaining Documents' Relevance To Search Queries Razieh Rahimi, Youngwoo Kim, Hamed Zamani, James Allan
- On The Safety Of Conversational Models: Taxonomy, Dataset, And Benchmark Hao Sun et al.
- Less Is More: Pre-train A Strong Text Encoder For Dense Retrieval Using A Weak Decoder Shuqi Lu et al.
- Fine-tuning Large Neural Language Models For Biomedical Natural Language Processing Robert Tinn et al.
- An Explanation Of In-context Learning As Implicit Bayesian Inference Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma
- Thank You BART! Rewarding Pre-trained Models Improves Formality Style Transfer Huiyuan Lai, Antonio Toral, Malvina Nissim
- Unsupervised Corpus Aware Language Model Pre-training For Dense Passage Retrieval Luyu Gao, Jamie Callan
- Fewshotqa: A Simple Framework For Few-shot Learning Of Question Answering Tasks Using Pre-trained Text-to-text Models Rakesh Chada, Pradeep Natarajan
- Mitigating Political Bias In Language Models Through Reinforced Calibration Ruibo Liu et al.
- Training Large-scale News Recommenders With Pretrained Language Models In The Loop Shitao Xiao, Zheng Liu, Yingxia Shao, Tao Di, Xing Xie
- Swinbert: End-to-end Transformers With Sparse Attention For Video Captioning Kevin Lin et al.
- One Chatbot Per Person: Creating Personalized Chatbots Based On Implicit User Profiles Zhengyi Ma, Zhicheng Dou, Yutao Zhu, Hanxun Zhong, Ji-rong Wen
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- Bitfit: Simple Parameter-efficient Fine-tuning For Transformer-based Masked Language-models Elad Ben Zaken, Shauli Ravfogel, Yoav Goldberg
- Arat5: Text-to-text Transformers For Arabic Language Generation El Moatez Billah Nagoudi, Abdelrahim Elmadany, Muhammad Abdul-mageed
- Urltran: Improving Phishing URL Detection Using Transformers Pranav Maneriker et al.
- Dialoglm: Pre-trained Model For Long Dialogue Understanding And Summarization Ming Zhong, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng
- TR-BERT: Dynamic Token Reduction For Accelerating BERT Inference Deming Ye, Yankai Lin, Yufei Huang, Maosong Sun
- Efficient Large-scale Language Model Training On GPU Clusters Using Megatron-lm Deepak Narayanan et al.
- Cross-attention Is All You Need: Adapting Pretrained Transformers For Machine Translation Mozhdeh Gheini, Xiang Ren, Jonathan May
- Luna: Linear Unified Nested Attention Xuezhe Ma et al.
- The Impact Of Multiple Parallel Phrase Suggestions On Email Input And Composition Behaviour Of Native And Non-native English Writers Daniel Buschek, Martin Zürn, Malin Eiband
- A Plug-and-play Method For Controlled Text Generation Damian Pascual, Beni Egressy, Clara Meister, Ryan Cotterell, Roger Wattenhofer
- Automated Quality Assessment Of Cognitive Behavioral Therapy Sessions Through Highly Contextualized Language Representations Nikolaos Flemotomos et al.
- True Few-shot Learning With Prompts -- A Real-world Perspective Timo Schick, Hinrich Schütze
- Why Do Pretrained Language Models Help In Downstream Tasks? An Analysis Of Head And Prompt Tuning Colin Wei, Sang Michael Xie, Tengyu Ma
- Codet5: Identifier-aware Unified Pre-trained Encoder-decoder Models For Code Understanding And Generation Yue Wang, Weishi Wang, Shafiq Joty, Steven C. H. Hoi
- Exploring Prompt-based Few-shot Learning For Grounded Dialog Generation Chujie Zheng, Minlie Huang
- Can Generative Pre-trained Language Models Serve As Knowledge Bases For Closed-book QA? Cunxiang Wang, Pai Liu, Yue Zhang
- Newsbert: Distilling Pre-trained Language Model For Intelligent News Application Chuhan Wu et al.
- Counterfactual Memorization In Neural Language Models Chiyuan Zhang et al.
- Pre-train, Prompt, And Predict: A Systematic Survey Of Prompting Methods In Natural Language Processing Pengfei Liu et al.
- NÜWA: Visual Synthesis Pre-training For Neural Visual World Creation Chenfei Wu et al.
- RAFT: A Real-world Few-shot Text Classification Benchmark Neel Alex et al.
- Is GPT-3 Text Indistinguishable From Human Text? Scarecrow: A Framework For Scrutinizing Machine Text Yao Dou, Maxwell Forbes, Rik Koncel-kedziorski, Noah A. Smith, Yejin Choi
- See, Hear, Read: Leveraging Multimodality With Guided Attention For Abstractive Text Summarization Yash Kumar Atri, Shraman Pramanick, Vikram Goyal, Tanmoy Chakraborty
- Adversarial GLUE: A Multi-task Benchmark For Robustness Evaluation Of Language Models Boxin Wang et al.
- Openprompt: An Open-source Framework For Prompt-learning Ning Ding et al.
- Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners Ningyu Zhang et al.
- Towards Facilitating Empathic Conversations In Online Mental Health Support: A Reinforcement Learning Approach Ashish Sharma, Inna W. Lin, Adam S. Miner, David C. Atkins, Tim Althoff
- HTLM: Hyper-text Pre-training And Prompting Of Language Models Armen Aghajanyan et al.
- Muppet: Massive Multi-task Representations With Pre-finetuning Armen Aghajanyan et al.
- Characterchat: Supporting The Creation Of Fictional Characters Through Conversation And Progressive Manifestation With A Chatbot Oliver Schmitt, Daniel Buschek
- Predicting The Performance Of Multilingual NLP Models Anirudh Srinivasan et al.
- Math Word Problem Generation With Mathematical Consistency And Problem Context Constraints Zichao Wang, Andrew S. Lan, Richard G. Baraniuk
- ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Yu Sun et al.
- Wordcraft: A Human-ai Collaborative Editor For Story Writing Andy Coenen, Luke Davis, Daphne Ippolito, Emily Reif, Ann Yuan
- Mind The Gap: Assessing Temporal Generalization In Neural Language Models Angeliki Lazaridou et al.
- Worst Of Both Worlds: Biases Compound In Pre-trained Vision-and-language Models Tejas Srinivasan, Yonatan Bisk
- Image Captioning For Effective Use Of Language Models In Knowledge-based Visual Question Answering Ander Salaberria, Gorka Azkune, Oier Lopez De Lacalle, Aitor Soroa, Eneko Agirre
- A General Language Assistant As A Laboratory For Alignment Amanda Askell et al.
- FLAVA: A Foundational Language And Vision Alignment Model Amanpreet Singh et al.
- Few-shot Question Answering By Pretraining Span Selection Ori Ram, Yuval Kirstain, Jonathan Berant, Amir Globerson, Omer Levy
- Webqa: Multihop And Multimodal QA Yingshan Chang et al.
- CPM-2: Large-scale Cost-effective Pre-trained Language Models Zhengyan Zhang et al.
- One Question Answering Model For Many Languages With Cross-lingual Dense Passage Retrieval Akari Asai, Xinyan Yu, Jungo Kasai, Hannaneh Hajishirzi
- Towards Retrieval-based Conversational Recommendation Ahtsham Manzoor, Dietmar Jannach
- Bertese: Learning To Speak To BERT Adi Haviv, Jonathan Berant, Amir Globerson
- Commitbert: Commit Message Generation Using Pre-trained Programming Language Model Tae-hwan Jung
- Quiz-style Question Generation For News Stories Adam D. Lelkes, Vinh Q. Tran, Cong Yu
- TURINGBENCH: A Benchmark Environment For Turing Test In The Age Of Neural Text Generation Adaku Uchendu, Zeyu Ma, Thai Le, Rui Zhang, Dongwon Lee
- Scalable And Efficient Moe Training For Multitask Multilingual Models Young Jin Kim et al.
- Embodied BERT: A Transformer Model For Embodied, Language-guided Visual Task Completion Alessandro Suglia, Qiaozi Gao, Jesse Thomason, Govind Thattai, Gaurav Sukhatme
- Visqa: X-raying Vision And Language Reasoning In Transformers Theo Jaunet et al.
- CANINE: Pre-training An Efficient Tokenization-free Encoder For Language Representation Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting
- Unlocking Compositional Generalization In Pre-trained Models Using Intermediate Representations Jonathan Herzig et al.
- Cotext: Multi-task Learning With Code-text Transformer Long Phan et al.
- Challenges In Detoxifying Language Models Johannes Welbl et al.
- Towards Continual Knowledge Learning Of Language Models Joel Jang et al.
- Scifive: A Text-to-text Transformer Model For Biomedical Literature Long N. Phan et al.
- Beyond Goldfish Memory: Long-term Open-domain Conversation Jing Xu, Arthur Szlam, Jason Weston
- Using Adversarial Attacks To Reveal The Statistical Bias In Machine Reading Comprehension Models Jieyu Lin, Jiajie Zou, Nai Ding
- SIMMC 2.0: A Task-oriented Dialog Dataset For Immersive Multimodal Conversations Satwik Kottur, Seungwhan Moon, Alborz Geramifard, Babak Damavandi
- Sentence-t5: Scalable Sentence Encoders From Pre-trained Text-to-text Models Jianmo Ni et al.
- Tip-adapter: Training-free Clip-adapter For Better Vision-language Modeling Renrui Zhang et al.
- Metaicl: Learning To Learn In Context Sewon Min, Mike Lewis, Luke Zettlemoyer, Hannaneh Hajishirzi
- Long Text Generation By Modeling Sentence-level And Discourse-level Coherence Jian Guan et al.
- Fastmoe: A Fast Mixture-of-expert Training System Jiaao He et al.
- FLAT: An Optimized Dataflow For Mitigating Attention Bottlenecks Sheng-chun Kao, Suvinay Subramanian, Gaurav Agrawal, Amir Yazdanbakhsh, Tushar Krishna
- Multimodal Few-shot Learning With Frozen Language Models Maria Tsimpoukelli et al.
- Recursively Summarizing Books With Human Feedback Jeff Wu et al.
- Scaling Language Models: Methods, Analysis & Insights From Training Gopher Jack W. Rae et al.
- Program Synthesis With Large Language Models Jacob Austin et al.
- Diagnosing Vision-and-language Navigation: What Really Matters Wanrong Zhu et al.
- Webgpt: Browser-assisted Question-answering With Human Feedback Reiichiro Nakano et al.
- Redditbias: A Real-world Resource For Bias Evaluation And Debiasing Of Conversational Language Models Soumya Barikeri, Anne Lauscher, Ivan Vulić, Goran Glavaš
- Hurdles To Progress In Long-form Question Answering Kalpesh Krishna, Aurko Roy, Mohit Iyyer
- WARP: Word-level Adversarial Reprogramming Karen Hambardzumyan, Hrant Khachatrian, Jonathan May
- Augmenting Sequential Recommendation With Pseudo-prior Items Via Reversely Pre-training Transformer Zhiwei Liu, Ziwei Fan, Yu Wang, Philip S. Yu
- Dialogue History Matters! Personalized Response Selectionin Multi-turn Retrieval-based Chatbots Juntao Li et al.
- Quality: Question Answering With Long Input Texts, Yes! Richard Yuanzhe Pang et al.
- Reframing Human-ai Collaboration For Generating Free-text Explanations Sarah Wiegreffe, Jack Hessel, Swabha Swayamdipta, Mark Riedl, Yejin Choi
- MATE: Multi-view Attention For Table Transformer Efficiency Julian Martin Eisenschlos, Maharshi Gor, Thomas Müller, William W. Cohen
- Normformer: Improved Transformer Pretraining With Extra Normalization Sam Shleifer, Jason Weston, Myle Ott
- On The Effectiveness Of Adapter-based Tuning For Pretrained Language Model Adaptation Ruidan He et al.
- Training Verifiers To Solve Math Word Problems Karl Cobbe et al.
- Language Models Show Human-like Content Effects On Reasoning Tasks Ishita Dasgupta et al.
- Progprompt: Generating Situated Robot Task Plans Using Large Language Models Ishika Singh et al.
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- Evaluating Mixed-initiative Conversational Search Systems Via User Simulation Ivan Sekulić, Mohammad Aliannejadi, Fabio Crestani
- Webshop: Towards Scalable Real-world Web Interaction With Grounded Language Agents Shunyu Yao, Howard Chen, John Yang, Karthik Narasimhan
- Coderl: Mastering Code Generation Through Pretrained Models And Deep Reinforcement Learning Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Silvio Savarese, Steven C. H. Hoi
- Recommendation As Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5) Shijie Geng, Shuchang Liu, Zuohui Fu, Yingqiang Ge, Yongfeng Zhang
- A Survey On Retrieval-augmented Text Generation Huayang Li, Yixuan Su, Deng Cai, Yan Wang, Lemao Liu
- Exploring Visual Prompts For Adapting Large-scale Models Hyojin Bahng, Ali Jahanian, Swami Sankaranarayanan, Phillip Isola
- Robotic Skill Acquisition Via Instruction Augmentation With Vision-language Models Ted Xiao et al.
- CLIP-TD: CLIP Targeted Distillation For Vision-language Tasks Zhecan Wang et al.
- Interactive And Visual Prompt Engineering For Ad-hoc Task Adaptation With Large Language Models Hendrik Strobelt et al.
- Interleaving Retrieval With Chain-of-thought Reasoning For Knowledge-intensive Multi-step Questions Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal
- Repair Is Nearly Generation: Multilingual Program Repair With Llms Harshit Joshi et al.
- Inner Monologue: Embodied Reasoning Through Planning With Language Models Wenlong Huang et al.
- Language Models As Zero-shot Planners: Extracting Actionable Knowledge For Embodied Agents Wenlong Huang, Pieter Abbeel, Deepak Pathak, Igor Mordatch
- Murag: Multimodal Retrieval-augmented Generator For Open Question Answering Over Images And Text Wenhu Chen, Hexiang Hu, Xi Chen, Pat Verga, William W. Cohen
- Large Language Models Are Few(1)-shot Table Reasoners Wenhu Chen
- Reasoning With Language Model Prompting: A Survey Shuofei Qiao et al.
- Less Is More: Learning To Refine Dialogue History For Personalized Dialogue Generation Hanxun Zhong, Zhicheng Dou, Yutao Zhu, Hongjin Qian, Ji-rong Wen
- Diffusiondb: A Large-scale Prompt Gallery Dataset For Text-to-image Generative Models Zijie J. Wang et al.
- How To Prompt? Opportunities And Challenges Of Zero- And Few-shot Learning For Human-ai Interaction In Creative Applications Of Generative Models Hai Dang, Lukas Mecke, Florian Lehmann, Sven Goller, Daniel Buschek
- Lost At C: A User Study On The Security Implications Of Large Language Model Code Assistants Gustavo Sandoval et al.
- Hybrid Transformer With Multi-level Fusion For Multimodal Knowledge Graph Completion Xiang Chen et al.
- Dialfred: Dialogue-enabled Agents For Embodied Instruction Following Xiaofeng Gao et al.
- Promptcap: Prompt-guided Task-aware Image Captioning Yushi Hu et al.
- Synchromesh: Reliable Code Generation From Pre-trained Language Models Gabriel Poesia et al.
- On The Transferability Of Pre-trained Language Models For Low-resource Programming Languages Fuxiang Chen, Fatemeh Fard, David Lo, Timofey Bryksin
- Language Models Are Multilingual Chain-of-thought Reasoners Freda Shi et al.
- Zeroquant: Efficient And Affordable Post-training Quantization For Large-scale Transformers Zhewei Yao et al.
- Planbench: An Extensible Benchmark For Evaluating Large Language Models On Planning And Reasoning About Change Karthik Valmeekam, Matthew Marquez, Alberto Olmo, Sarath Sreedharan, Subbarao Kambhampati
- Alexatm 20B: Few-shot Learning Using A Large-scale Multilingual Seq2seq Model Saleh Soltan et al.
- Accelerating Attention Through Gradient-based Learned Runtime Pruning Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, Hadi Esmaeilzadeh, Mingu Kang
- Cramming: Training A Language Model On A Single GPU In One Day Jonas Geiping, Tom Goldstein
- Evolution Through Large Models Joel Lehman et al.
- News Summarization And Evaluation In The Era Of GPT-3 Tanya Goyal, Junyi Jessy Li, Greg Durrett
- Do Large Language Models Know What Humans Know? Sean Trott, Cameron Jones, Tyler Chang, James Michaelov, Benjamin Bergen
- Incorporating Domain Knowledge Through Task Augmentation For Front-end Javascript Code Generation Sijie Shen et al.
- Unified-io: A Unified Model For Vision, Language, And Multi-modal Tasks Jiasen Lu, Christopher Clark, Rowan Zellers, Roozbeh Mottaghi, Aniruddha Kembhavi
- Scaling Autoregressive Models For Content-rich Text-to-image Generation Jiahui Yu et al.
- React: Synergizing Reasoning And Acting In Language Models Shunyu Yao et al.
- Flamingo: A Visual Language Model For Few-shot Learning Jean-baptiste Alayrac et al.
- BERTIN: Efficient Pre-training Of A Spanish Language Model Using Perplexity Sampling Javier De La Rosa et al.
- Visconde: Multi-document QA With GPT-3 And Neural Reranking Jayr Pereira, Robson Fidalgo, Roberto Lotufo, Rodrigo Nogueira
- Convfinqa: Exploring The Chain Of Numerical Reasoning In Conversational Finance Question Answering Zhiyu Chen et al.
- Teaching Language Models To Support Answers With Verified Quotes Jacob Menick et al.
- Language Models As Agent Models Jacob Andreas
- Gpt-neox-20b: An Open-source Autoregressive Language Model Sid Black et al.
- Neural Theory-of-mind? On The Limits Of Social Intelligence In Large Lms Maarten Sap, Ronan Lebras, Daniel Fried, Yejin Choi
- RARR: Researching And Revising What Language Models Say, Using Language Models Luyu Gao et al.
- Inpars: Data Augmentation For Information Retrieval Using Large Language Models Luiz Bonifacio, Hugo Abonizio, Marzieh Fadaee, Rodrigo Nogueira
- Prompting Is Programming: A Query Language For Large Language Models Luca Beurer-kellner, Marc Fischer, Martin Vechev
- Training Language Models To Follow Instructions With Human Feedback Long Ouyang et al.
- In-context Examples Selection For Machine Translation Sweta Agrawal, Chunting Zhou, Mike Lewis, Luke Zettlemoyer, Marjan Ghazvininejad
- Structured Like A Language Model: Analysing AI As An Automated Subject Liam Magee, Vanicka Arora, Luke Munn
- Phenaki: Variable Length Video Generation From Open Domain Textual Description Ruben Villegas et al.
- The Goldilocks Of Pragmatic Understanding: Fine-tuning Strategy Matters For Implicature Resolution By Llms Laura Ruis et al.
- Galactica: A Large Language Model For Science Ross Taylor et al.
- Memorization Without Overfitting: Analyzing The Training Dynamics Of Large Language Models Kushal Tirumala, Aram H. Markosyan, Luke Zettlemoyer, Armen Aghajanyan
- Promptagator: Few-shot Dense Retrieval From 8 Examples Zhuyun Dai et al.
- Do As I Can, Not As I Say: Grounding Language In Robotic Affordances Michael Ahn et al.
- Is Reinforcement Learning (not) For Natural Language Processing: Benchmarks, Baselines, And Building Blocks For Natural Language Policy Optimization Rajkumar Ramamurthy et al.
- Who Is GPT-3? An Exploration Of Personality, Values And Demographics Marilù Miotto, Nicola Rossberg, Bennett Kleinberg
- Efficient Long-text Understanding With Short-text Models Maor Ivgi, Uri Shaham, Jonathan Berant
- Language Models Are Realistic Tabular Data Generators Vadim Borisov, Kathrin Seßler, Tobias Leemann, Martin Pawelczyk, Gjergji Kasneci
- Can Large Language Models Reason About Medical Questions? Valentin Liévin, Christoffer Egeberg Hother, Andreas Geert Motzfeldt, Ole Winther
- Gpt-3-driven Pedagogical Agents For Training Children's Curious Question-asking Skills Rania Abdelghani et al.
- Can Machines Help Us Answering Question 16 In Datasheets, And In Turn Reflecting On Inappropriate Content? Patrick Schramowski, Christopher Tauchmann, Kristian Kersting
- Self-conditioned Embedding Diffusion For Text Generation Robin Strudel et al.
- Pangu-coder: Program Synthesis With Function-level Language Modeling Fenia Christopoulou et al.
- SKILL: Structured Knowledge Infusion For Large Language Models Fedor Moiseev, Zhe Dong, Enrique Alfonseca, Martin Jaggi
- Deplot: One-shot Visual Language Reasoning By Plot-to-table Translation Fangyu Liu et al.
- Sgva-clip: Semantic-guided Visual Adapting Of Vision-language Models For Few-shot Image Classification Fang Peng, Xiaoshan Yang, Linhui Xiao, Yaowei Wang, Changsheng Xu
- Long Time No See! Open-domain Conversation With Long-term Persona Memory Xinchao Xu et al.
- Red Teaming Language Models With Language Models Ethan Perez et al.
- Matcha: Enhancing Visual Language Pretraining With Math Reasoning And Chart Derendering Fangyu Liu et al.
- Greaselm: Graph Reasoning Enhanced Language Models For Question Answering Xikun Zhang et al.
- Memory-based Model Editing At Scale Eric Mitchell, Charles Lin, Antoine Bosselut, Christopher D. Manning, Chelsea Finn
- Quark: Controllable Text Generation With Reinforced Unlearning Ximing Lu et al.
- LAVIS: A Library For Language-vision Intelligence Dongxu Li et al.
- Legal Prompt Engineering For Multilingual Legal Judgement Prediction Dietrich Trautmann, Alina Petrova, Frank Schilder
- Successive Prompting For Decomposing Complex Questions Dheeru Dua, Shivanshu Gupta, Sameer Singh, Matt Gardner
- Lm-nav: Robotic Navigation With Large Pre-trained Models Of Language, Vision, And Action Dhruv Shah, Blazej Osinski, Brian Ichter, Sergey Levine
- VIMA: General Robot Manipulation With Multimodal Prompts Yunfan Jiang et al.
- Binding Language Models In Symbolic Languages Zhoujun Cheng et al.
- Least-to-most Prompting Enables Complex Reasoning In Large Language Models Denny Zhou et al.
- Protoclip: Prototypical Contrastive Language Image Pretraining Delong Chen et al.
- Future Transformer For Long-term Action Anticipation Dayoung Gong, Joonseok Lee, Manjin Kim, Seong Jong Ha, Minsu Cho
- Prompting Palm For Translation: Assessing Strategies And Performance David Vilar et al.
- Large Language Models Meet Nl2code: A Survey Daoguang Zan et al.
- Hungry Hungry Hippos: Towards Language Modeling With State Space Models Daniel Y. Fu et al.
- Why Can GPT Learn In-context? Language Models Implicitly Perform Gradient Descent As Meta-optimizers Damai Dai et al.
- Language And Culture Internalisation For Human-like Autotelic AI Cédric Colas, Tristan Karch, Clément Moulin-frier, Pierre-yves Oudeyer
- Unified Vision And Language Prompt Learning Yuhang Zang, Wei Li, Kaiyang Zhou, Chen Huang, Chen Change Loy
- Discovering Latent Knowledge In Language Models Without Supervision Collin Burns, Haotian Ye, Dan Klein, Jacob Steinhardt
- Augesc: Dialogue Augmentation With Large Language Models For Emotional Support Conversation Chujie Zheng, Sahand Sabour, Jiaxin Wen, Zheng Zhang, Minlie Huang
- Competition-level Code Generation With Alphacode Yujia Li et al.
- Cont: Contrastive Neural Text Generation Chenxin An et al.
- Linearly Mapping From Image To Text Space Jack Merullo, Louis Castricato, Carsten Eickhoff, Ellie Pavlick
- Prompting GPT-3 To Be Reliable Chenglei Si et al.
- Texts As Images In Prompt Tuning For Multi-label Image Recognition Zixian Guo et al.
- Exploring Length Generalization In Large Language Models Cem Anil et al.
- Adamix: Mixture-of-adaptations For Parameter-efficient Model Tuning Yaqing Wang et al.
- Optimizing Prompts For Text-to-image Generation Yaru Hao, Zewen Chi, Li Dong, Furu Wei
- Language Models Are General-purpose Interfaces Yaru Hao et al.
- Why Does Surprisal From Larger Transformer-based Language Models Provide A Poorer Fit To Human Reading Times? Byung-doh Oh, William Schuler
- Iteratively Prompt Pre-trained Language Models For Chain Of Thought Boshi Wang, Xiang Deng, Huan Sun
- Survey Of Hallucination In Natural Language Generation Ziwei Ji et al.
- Multimodal Knowledge Alignment With Reinforcement Learning Youngjae Yu et al.
- Language Models Can See: Plugging Visual Controls In Text Generation Yixuan Su et al.
- Analogy Generation By Prompting Large Language Models: A Case Study Of Instructgpt Bhavya Bhavya, Jinjun Xiong, Chengxiang Zhai
- Prompt-aligned Gradient For Prompt Tuning Beier Zhu, Yulei Niu, Yucheng Han, Yue Wu, Hanwang Zhang
- GODEL: Large-scale Pre-training For Goal-directed Dialog Baolin Peng et al.
- Reshaping Robot Trajectories Using Natural Language Commands: A Study Of Multi-modal Data Alignment Using Transformers Arthur Bucker et al.
- GLM-130B: An Open Bilingual Pre-trained Model Aohan Zeng et al.
- Faithful Reasoning Using Large Language Models Antonia Creswell, Murray Shanahan
- Selection-inference: Exploiting Large Language Models For Interpretable Logical Reasoning Antonia Creswell, Murray Shanahan, Irina Higgins
- Mslam: Massively Multilingual Joint Pre-training For Speech And Text Ankur Bapna et al.
- Socratic Models: Composing Zero-shot Multimodal Reasoning With Language Andy Zeng et al.
- The AI Teacher Test: Measuring The Pedagogical Ability Of Blender And GPT-3 In Educational Dialogues Anaïs Tack, Chris Piech
- Don't Generate, Discriminate: A Proposal For Grounding Language Models To Real-world Environments Yu Gu, Xiang Deng, Yu Su
- Prompt-to-prompt Image Editing With Cross Attention Control Amir Hertz et al.
- Improving Alignment Of Dialogue Agents Via Targeted Human Judgements Amelia Glaese et al.
- Can Language Models Learn From Explanations In Context? Andrew K. Lampinen et al.
- Active Example Selection For In-context Learning Yiming Zhang, Shi Feng, Chenhao Tan
- Standing On The Shoulders Of Giant Frozen Language Models Yoav Levine et al.
- When Not To Trust Language Models: Investigating Effectiveness Of Parametric And Non-parametric Memories Alex Mallen et al.
- REVEAL: Retrieval-augmented Visual-language Pre-training With Multi-source Multimodal Knowledge Memory Ziniu Hu et al.
- Empowering Language Models With Knowledge Graph Reasoning For Question Answering Ziniu Hu et al.
- Dualprompt: Complementary Prompting For Rehearsal-free Continual Learning Zifeng Wang et al.
- What Is It Like To Program With Artificial Intelligence? Advait Sarkar et al.
- A New Path: Scaling Vision-and-language Navigation With Synthetic Instructions And Imitation Learning Aishwarya Kamath et al.
- Language Models Are Greedy Reasoners: A Systematic Formal Analysis Of Chain-of-thought Abulhair Saparov, He He
- Beyond The Imitation Game: Quantifying And Extrapolating The Capabilities Of Language Models Aarohi Srivastava et al.
- Solving Quantitative Reasoning Problems With Language Models Aitor Lewkowycz et al.
- Dynamic Prompt Learning Via Policy Gradient For Semi-structured Mathematical Reasoning Pan Lu et al.
- ROSCOE: A Suite Of Metrics For Scoring Step-by-step Reasoning Olga Golovneva et al.
- What Matters In Language Conditioned Robotic Imitation Learning Over Unstructured Data Oier Mees, Lukas Hermann, Wolfram Burgard
- On The Origin Of Hallucinations In Conversational Models: Is It The Datasets Or The Models? Nouha Dziri, Sivan Milton, Mo Yu, Osmar Zaiane, Siva Reddy
- Grounding Language With Visual Affordances Over Unstructured Data Oier Mees, Jessica Borja-diaz, Wolfram Burgard
- Chatgpt: The End Of Online Exam Integrity? Teo Susnjak
- Faithdial: A Faithful Benchmark For Information-seeking Dialogue Nouha Dziri et al.
- No Language Left Behind: Scaling Human-centered Machine Translation Nllb Team et al.
- Demonstrate-search-predict: Composing Retrieval And Language Models For Knowledge-intensive NLP Omar Khattab et al.
- Delta Tuning: A Comprehensive Study Of Parameter Efficient Methods For Pre-trained Language Models Ning Ding et al.
- Large Language Models Struggle To Learn Long-tail Knowledge Nikhil Kandpal, Haikang Deng, Adam Roberts, Eric Wallace, Colin Raffel
- Generate Rather Than Retrieve: Large Language Models Are Strong Context Generators Wenhao Yu et al.
- Maple: Multi-modal Prompt Learning Muhammad Uzair Khattak, Hanoona Rasheed, Muhammad Maaz, Salman Khan, Fahad Shahbaz Khan
- Fine-tuned Language Models Are Continual Learners Thomas Scialom, Tuhin Chakrabarty, Smaranda Muresan
- Transformer Feed-forward Layers Build Predictions By Promoting Concepts In The Vocabulary Space Mor Geva, Avi Caciularu, Kevin Ro Wang, Yoav Goldberg
- Rlprompt: Optimizing Discrete Text Prompts With Reinforcement Learning Mingkai Deng et al.
- Evaluating Human-language Model Interaction Mina Lee et al.
- Few-shot Training Llms For Project-specific Code-summarization Toufique Ahmed, Premkumar Devanbu
- CLIPPO: Image-and-language Understanding From Pixels Only Michael Tschannen, Basil Mustafa, Neil Houlsby
- Coauthor: Designing A Human-ai Collaborative Writing Dataset For Exploring Language Model Capabilities Mina Lee, Percy Liang, Qian Yang
- GPT Takes The Bar Exam Michael Ii Bommarito, Daniel Martin Katz
- Confident Adaptive Language Modeling Tal Schuster et al.
- Make-a-video: Text-to-video Generation Without Text-video Data Uriel Singer et al.
- Meta Policy Learning For Cold-start Conversational Recommendation Zhendong Chu, Hongning Wang, Yun Xiao, Bo Long, Lingfei Wu
- Help Me Write A Poem: Instruction Tuning As A Vehicle For Collaborative Poetry Writing Tuhin Chakrabarty, Vishakh Padmakumar, He He
- M6-rec: Generative Pretrained Language Models Are Open-ended Recommender Systems Zeyu Cui, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang
- Holistic Evaluation Of Language Models Percy Liang et al.
- OFA: Unifying Architectures, Tasks, And Modalities Through A Simple Sequence-to-sequence Learning Framework Peng Wang et al.
- Conversing With Copilot: Exploring Prompt Engineering For Solving CS1 Problems Using Natural Language Paul Denny, Viraj Kumar, Nasser Giacaman
- Zero- And Few-shot Prompting With Llms: A Comparative Study With Fine-tuned Models For Bangla Sentiment Analysis Md. Arid Hasan et al.
- Do Large Language Models Resemble Humans In Language Use? Zhenguang G. Cai, Xufeng Duan, David A. Haslett, Shuqi Wang, Martin J. Pickering
- From Image To Language: A Critical Analysis Of Visual Question Answering (VQA) Approaches, Challenges, And Opportunities Md Farhan Ishmam, Md Sakib Hossain Shovon, M. F. Mridha, Nilanjan Dey
- A Systematic Study And Comprehensive Evaluation Of Chatgpt On Benchmark Datasets Md Tahmid Rahman Laskar et al.
- Leancontext: Cost-efficient Domain-specific Question Answering Using Llms Md Adnan Arefeen, Biplob Debnath, Srimat Chakradhar
- Gptaraeval: A Comprehensive Evaluation Of Chatgpt On Arabic NLP Md Tawkat Islam Khondaker, Abdul Waheed, El Moatez Billah Nagoudi, Muhammad Abdul-mageed
- Distilling Large Language Models For Matching Patients To Clinical Trials Mauro Nievas, Aditya Basu, Yanshan Wang, Hrituraj Singh
- Few-shot Fine-tuning Vs. In-context Learning: A Fair Comparison And Evaluation Marius Mosbach, Tiago Pimentel, Shauli Ravfogel, Dietrich Klakow, Yanai Elazar
- LLM Self Defense: By Self Examination, Llms Know They Are Being Tricked Mansi Phute et al.
- Flexkbqa: A Flexible Llm-powered Framework For Few-shot Knowledge Base Question Answering Zhenyu Li et al.
- Natural Language Generation And Understanding Of Big Code For Ai-assisted Programming: A Review Man Fai Wong, Shangxin Guo, Ching Nam Hang, Siu Wai Ho, Chee Wei Tan
- The Reversal Curse: Llms Trained On "A Is B" Fail To Learn "B Is A" Lukas Berglund et al.
- Chatgpt For Vulnerability Detection, Classification, And Repair: How Far Are We? Michael Fu, Chakkrit Tantithamthavorn, Van Nguyen, Trung Le
- Document-level Machine Translation With Large Language Models Longyue Wang et al.
- Llm-grounded Diffusion: Enhancing Prompt Understanding Of Text-to-image Diffusion Models With Large Language Models Long Lian, Boyi Li, Adam Yala, Trevor Darrell
- Generative Artificial Intelligence In Learning Analytics: Contextualising Opportunities And Challenges Through The Learning Analytics Cycle Lixiang Yan, Roberto Martinez-maldonado, Dragan Gašević
- Practical And Ethical Challenges Of Large Language Models In Education: A Systematic Scoping Review Lixiang Yan et al.
- Comparing Sentence-level Suggestions To Message-level Suggestions In Ai-mediated Communication Liye Fu, Benjamin Newman, Maurice Jakesch, Sarah Kreps
- A Bibliometric Review Of Large Language Models Research From 2017 To 2023 Lizhou Fan et al.
- From Word Models To World Models: Translating From Natural Language To The Probabilistic Language Of Thought Lionel Wong et al.
- Driving With Llms: Fusing Object-level Vector Modality For Explainable Autonomous Driving Long Chen et al.
- Parameter-efficient Fine-tuning Methods For Pretrained Language Models: A Critical Review And Assessment Lingling Xu, Haoran Xie, Si-zhao Joe Qin, Xiaohui Tao, Fu Lee Wang
- Leveraging Pre-trained Large Language Models To Construct And Utilize World Models For Model-based Task Planning Lin Guan, Karthik Valmeekam, Sarath Sreedharan, Subbarao Kambhampati
- Do Llms Exhibit Human-like Response Biases? A Case Study In Survey Design Lindia Tjuatja, Valerie Chen, Sherry Tongshuang Wu, Ameet Talwalkar, Graham Neubig
- Reasoning On Graphs: Faithful And Interpretable Large Language Model Reasoning Linhao Luo, Yuan-fang Li, Gholamreza Haffari, Shirui Pan
- Next-step Hint Generation For Introductory Programming Using Large Language Models Lianne Roest, Hieke Keuning, Johan Jeuring
- Can Chatgpt Replace Stackoverflow? A Study On Robustness And Reliability Of Large Language Model Code Generation Li Zhong, Zilong Wang
- Zephyr: Direct Distillation Of LM Alignment Lewis Tunstall et al.
- A Survey On Hallucination In Large Language Models: Principles, Taxonomy, Challenges, And Open Questions Lei Huang et al.
- Zero-shot Next-item Recommendation Using Large Pretrained Language Models Lei Wang, Ee-peng Lim
- Dissociating Language And Thought In Large Language Models Kyle Mahowald et al.
- Mvbench: A Comprehensive Multi-modal Video Understanding Benchmark Kunchang Li et al.
- Superclue: A Comprehensive Chinese Large Language Model Benchmark Liang Xu et al.
- Domain-specific Chatbots For Science Using Embeddings Kevin G. Yager
- Inference-time Intervention: Eliciting Truthful Answers From A Language Model Kenneth Li, Oam Patel, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg
- Automating Customer Service Using Langchain: Building Custom Open-source GPT Chatbot For Organizations Keivalya Pandya, Mehfuza Holia
- Speak, Memory: An Archaeology Of Books Known To Chatgpt/gpt-4 Kent K. Chang, Mackenzie Cramer, Sandeep Soni, David Bamman
- Just Ask For Calibration: Strategies For Eliciting Calibrated Confidence Scores From Language Models Fine-tuned With Human Feedback Katherine Tian et al.
- Evaluating Language Models For Mathematics Through Interactions Katherine M. Collins et al.
- Automatic Prompt Augmentation And Selection With Chain-of-thought From Labeled Data Kashun Shum, Shizhe Diao, Tong Zhang
- Geochat: Grounded Large Vision-language Model For Remote Sensing Kartik Kuckreja et al.
- Topical-chat: Towards Knowledge-grounded Open-domain Conversations Karthik Gopalakrishnan et al.
- Towards Expert-level Medical Question Answering With Large Language Models Karan Singhal et al.
- The Imitation Game: Detecting Human And Ai-generated Texts In The Era Of Chatgpt And BARD Kadhim Hayawi, Sakib Shahriar, Sujith Samuel Mathew
- Not What You've Signed Up For: Compromising Real-world Llm-integrated Applications With Indirect Prompt Injection Kai Greshake et al.
- Writer-defined AI Personas For On-demand Feedback Generation Karim Benharrak, Tim Zindulka, Florian Lehmann, Hendrik Heuer, Daniel Buschek
- Ai-augmented Surveys: Leveraging Large Language Models And Surveys For Opinion Prediction Junsol Kim, Byungkyu Lee
- Agentcf: Collaborative Learning With Autonomous Language Agents For Recommender Systems Junjie Zhang et al.
- A Comprehensive Capability Analysis Of GPT-3 And GPT-3.5 Series Models Junjie Ye et al.
- Recommendation As Instruction Following: A Large Language Model Empowered Recommendation Approach Junjie Zhang et al.
- Is Chatgpt A Good Recommender? A Preliminary Study Junling Liu et al.
- Chatcounselor: A Large Language Models For Mental Health Support June M. Liu et al.
- Spear Phishing With Large Language Models Julian Hazell
- Towards Llm-based Autograding For Short Textual Answers Johannes Schneider, Bernd Schenk, Christina Niklaus
- LEXTREME: A Multi-lingual And Multi-task Benchmark For The Legal Domain Joel Niklaus et al.
- Is Chatgpt Fair For Recommendation? Evaluating Fairness In Large Language Model Recommendation Jizhi Zhang et al.
- The Political Ideology Of Conversational AI: Converging Evidence On Chatgpt's Pro-environmental, Left-libertarian Orientation Jochen Hartmann, Jasper Schwenzow, Maximilian Witte
- Qwen Technical Report Jinze Bai et al.
- Missrec: Pre-training And Transferring Multi-modal Interest-aware Sequence Representation For Recommendation Jinpeng Wang et al.
- Bias And Fairness In Chatbots: An Overview Jintang Xue et al.
- The Potential And Pitfalls Of Using A Large Language Model Such As Chatgpt Or GPT-4 As A Clinical Assistant Jingqing Zhang et al.
- Structgpt: A General Framework For Large Language Model To Reason Over Structured Data Jinhao Jiang et al.
- Generating Images With Multimodal Language Models Jing Yu Koh, Daniel Fried, Ruslan Salakhutdinov
- Grounding Language Models To Images For Multimodal Inputs And Outputs Jing Yu Koh, Ruslan Salakhutdinov, Daniel Fried
- When Large Language Models Meet Personalization: Perspectives Of Challenges And Opportunities Jin Chen et al.
- Prompt-and-align: Prompt-based Social Alignment For Few-shot Fake News Detection Jiaying Wu, Shen Li, Ailin Deng, Miao Xiong, Bryan Hooi
- A Systematic Survey Of Prompt Engineering On Vision-language Foundation Models Jindong Gu et al.
- Empowering Molecule Discovery For Molecule-caption Translation With Large Language Models: A Chatgpt Perspective Jiatong Li et al.
- Badgpt: Exploring Security Vulnerabilities Of Chatgpt Via Backdoor Attacks To Instructgpt Jiawen Shi, Yixin Liu, Pan Zhou, Lichao Sun
- Set-of-mark Prompting Unleashes Extraordinary Visual Grounding In GPT-4V Jianwei Yang et al.
- Ethical Chatgpt: Concerns, Challenges, And Commandments Jianlong Zhou, Heimo Müller, Andreas Holzinger, Fang Chen
- Language Models Meet World Models: Embodied Experiences Enhance Language Models Jiannan Xiang et al.
- How Can Recommender Systems Benefit From Large Language Models: A Survey Jianghao Lin et al.
- Rella: Retrieval-enhanced Large Language Models For Lifelong Sequential Behavior Comprehension In Recommendation Jianghao Lin et al.
- Ureader: Universal Ocr-free Visually-situated Language Understanding With Multimodal Large Language Model Jiabo Ye et al.
- VILA: On Pre-training For Visual Language Models Ji Lin et al.
- Large Language Models In Medicine: The Potentials And Pitfalls Jesutofunmi A. Omiye, Haiwen Gui, Shawheen J. Rezaei, James Zou, Roxana Daneshjou
- Physically Grounded Vision-language Models For Robotic Manipulation Jensen Gao et al.
- Large Language Models (GPT) Struggle To Answer Multiple-choice Questions About Code Jaromir Savelka, Arav Agarwal, Christopher Bogart, Majd Sakr
- The Robots Are Here: Navigating The Generative AI Revolution In Computing Education James Prather et al.
- Thrilled By Your Progress! Large Language Models (GPT-4) No Longer Struggle To Pass Assessments In Higher Education Programming Courses Jaromir Savelka, Arav Agarwal, Marshall An, Chris Bogart, Majd Sakr
- A Comparative Study Of Ai-generated (GPT-4) And Human-crafted Mcqs In Programming Education Jacob Doughty et al.
- Simple And Controllable Music Generation Jade Copet et al.
- Chip-chat: Challenges And Opportunities In Conversational Hardware Design Jason Blocklove, Siddharth Garg, Ramesh Karri, Hammond Pearce
- Fake News In Sheep's Clothing: Robust Fake News Detection Against Llm-empowered Style Attacks Jiaying Wu, Jiafeng Guo, Bryan Hooi
- More Robots Are Coming: Large Multimodal Models (chatgpt) Can Solve Visually Diverse Images Of Parsons Problems Irene Hou et al.
- Chainforge: A Visual Toolkit For Prompt Engineering And LLM Hypothesis Testing Ian Arawjo, Chelse Swoopes, Priyan Vaithilingam, Martin Wattenberg, Elena Glassman
- Principle-driven Self-alignment Of Language Models From Scratch With Minimal Human Supervision Zhiqing Sun et al.
- Theory Of Mind For Multi-agent Collaboration Via Large Language Models Huao Li et al.
- "It's A Fair Game", Or Is It? Examining How Users Navigate Disclosure Risks And Benefits When Using Llm-based Conversational Agents Zhiping Zhang et al.
- Building Cooperative Embodied Agents Modularly With Large Language Models Hongxin Zhang et al.
- Fingpt: Open-source Financial Large Language Models Hongyang Yang, Xiao-yang Liu, Christina Dan Wang
- Doctorglm: Fine-tuning Your Chinese Doctor Is Not A Herculean Task Honglin Xiong et al.
- Llmind: Orchestrating AI And Iot With LLM For Complex Task Execution Hongwei Cui, Yuyang Du, Qun Yang, Yulin Shao, Soung Chang Liew
- Large Language Models Can Infer Psychological Dispositions Of Social Media Users Heinrich Peters, Sandra Matz
- Extending Context Window Of Large Language Models Via Positional Interpolation Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian
- Boosting Theory-of-mind Performance In Large Language Models Via Prompting Shima Rahimi Moghaddam, Christopher J. Honey
- Llm-in-the-loop: Leveraging Large Language Model For Thematic Analysis Shih-chieh Dai, Aiping Xiong, Lun-wei Ku
- Reasoning With Language Model Is Planning With World Model Shibo Hao et al.
- Large Language Model Augmented Narrative Driven Recommendations Sheshera Mysore, Andrew Mccallum, Hamed Zamani
- Toolkengpt: Augmenting Frozen Language Models With Massive Tools Via Tool Embeddings Shibo Hao, Tianyang Liu, Zhen Wang, Zhiting Hu
- Scaling Vision-language Models With Sparse Mixture Of Experts Sheng Shen et al.
- The Flan Collection: Designing Data And Methods For Effective Instruction Tuning Shayne Longpre et al.
- Decoding Chatgpt: A Taxonomy Of Existing Research, Current Challenges, And Possible Future Directions Shahab Saquib Sohail et al.
- The Cot Collection: Improving Zero-shot And Few-shot Learning Of Language Models Via Chain-of-thought Fine-tuning Seungone Kim et al.
- Factscore: Fine-grained Atomic Evaluation Of Factual Precision In Long Form Text Generation Sewon Min et al.
- Language Is Not All You Need: Aligning Perception With Language Models Shaohan Huang et al.
- Next-gpt: Any-to-any Multimodal LLM Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, Tat-seng Chua
- H2O: Heavy-hitter Oracle For Efficient Generative Inference Of Large Language Models Zhenyu Zhang et al.
- A Comparative Study Of Open-source Large Language Models, GPT-4 And Claude 2: Multiple-choice Test Taking In Nephrology Sean Wu et al.
- Medalign: A Clinician-generated Dataset For Instruction Following With Electronic Medical Records Scott L. Fleming et al.
- Large Language Models Are Competitive Near Cold-start Recommenders For Language- And Item-based Preferences Scott Sanner, Krisztian Balog, Filip Radlinski, Ben Wedin, Lucas Dixon
- Let's Have A Chat! A Conversation With Chatgpt: Technology, Applications, And Limitations Sakib Shahriar, Kadhim Hayawi
- Fine-tuning Language Models With Just Forward Passes Sadhika Malladi et al.
- Chatgpt Vs. Google: A Comparative Study Of Search Performance And User Experience Ruiyun Rayna Xu, Yue Katherine Feng, Hailiang Chen
- Secrets Of RLHF In Large Language Models Part I: PPO Rui Zheng et al.
- Retrieving Multimodal Information For Augmented Generation: A Survey Ruochen Zhao et al.
- Audiogpt: Understanding And Generating Speech, Music, Sound, And Talking Head Rongjie Huang et al.
- Palm 2 Technical Report Rohan Anil et al.
- In-context Learning Creates Task Vectors Roee Hendel, Mor Geva, Amir Globerson
- Llm-assisted Content Analysis: Using Large Language Models To Support Deductive Coding Robert Chew, John Bollenbacher, Michael Wenger, Jessica Speer, Annice Kim
- Retrieval-augmented Image Captioning Rita Ramos, Desmond Elliott, Bruno Martins
- Scaling Laws For Language Encoding Models In Fmri Richard Antonello, Aditya Vaidya, Alexander G. Huth
- Chatgpt Versus Traditional Question Answering For Knowledge Graphs: Current Status And Future Directions Towards Knowledge Graph Chatbots Reham Omar, Omij Mangukiya, Panos Kalnis, Essam Mansour
- A Universal Question-answering Platform For Knowledge Graphs Reham Omar, Ishika Dhall, Panos Kalnis, Essam Mansour
- Automatic Prompt Optimization With "gradient Descent" And Beam Search Reid Pryzant et al.
- Prompt-based Distribution Alignment For Unsupervised Domain Adaptation Shuanghao Bai et al.
- Chatgpt As A Factual Inconsistency Evaluator For Text Summarization Zheheng Luo, Qianqian Xie, Sophia Ananiadou
- VELMA: Verbalization Embodiment Of LLM Agents For Vision And Language Navigation In Street View Raphael Schumann et al.
- How Secure Is Code Generated By Chatgpt? Raphaël Khoury, Anderson R. Avila, Jacob Brunelle, Baba Mamadou Camara
- Sabiá: Portuguese Large Language Models Ramon Pires, Hugo Abonizio, Thales Sales Almeida, Rodrigo Nogueira
- Large Language Models Predict Human Sensory Judgments Across Six Modalities Raja Marjieh, Ilia Sucholutsky, Pol Van Rijn, Nori Jacoby, Thomas L. Griffiths
- Can We Trust The Evaluation On Chatgpt? Rachith Aiyappa, Jisun An, Haewoon Kwak, Yong-yeol Ahn
- Lawyer Llama Technical Report Quzhe Huang et al.
- Direct Preference Optimization: Your Language Model Is Secretly A Reward Model Rafael Rafailov et al.
- Mplug-owl: Modularization Empowers Large Language Models With Multimodality Qinghao Ye et al.
- Adalora: Adaptive Budget Allocation For Parameter-efficient Fine-tuning Qingru Zhang et al.
- Pushing Large Language Models To The 6G Edge: Vision, Challenges, And Opportunities Zheng Lin et al.
- Medcpt: Contrastive Pre-trained Transformers With Large-scale Pubmed Search Logs For Zero-shot Biomedical Information Retrieval Qiao Jin et al.
- Designerly Understanding: Information Needs For Model Transparency To Support Design Ideation For Ai-powered User Experience Q. Vera Liao, Hariharan Subramonyam, Jennifer Wang, Jennifer Wortman Vaughan
- AI Transparency In The Age Of Llms: A Human-centered Research Roadmap Q. Vera Liao, Jennifer Wortman Vaughan
- "It Felt Like Having A Second Mind": Investigating Human-ai Co-creativity In Prewriting With Large Language Models Qian Wan et al.
- Harnessing Llms In Curricular Design: Using GPT-4 To Support Authoring Of Learning Objectives Pragnya Sridhar et al.
- Students' Perceptions And Preferences Of Generative Artificial Intelligence Feedback For Programming Zhengdong Zhang et al.
- Visually-prompted Language Model For Fine-grained Scene Graph Generation In An Open World Qifan Yu et al.
- Chat-univi: Unified Visual Representation Empowers Large Language Models With Image And Video Understanding Peng Jin, Ryuichi Takanobu, Wancai Zhang, Xiaochun Cao, Li Yuan
- Llama-adapter V2: Parameter-efficient Visual Instruction Model Peng Gao et al.
- Graphologue: Exploring Large Language Model Responses With Interactive Diagrams Peiling Jiang, Jude Rayan, Steven P. Dow, Haijun Xia
- Starcoder: May The Source Be With You! Raymond Li et al.
- VISAR: A Human-ai Argumentative Writing Assistant With Visual Programming And Rapid Draft Prototyping Zheng Zhang, Jie Gao, Ranjodh Singh Dhaliwal, Toby Jia-jun Li
- Going Beyond Nouns With Vision & Language Models Using Synthetic Data Paola Cascante-bonilla et al.
- Internlm-xcomposer: A Vision-language Large Model For Advanced Text-image Comprehension And Composition Pan Zhang et al.
- In-context Retrieval-augmented Language Models Ori Ram et al.
- GPT-4 Technical Report Openai et al.
- Ontochatgpt Information System: Ontology-driven Structured Prompts For Chatgpt Meta-learning Oleksandr Palagin, Vladislav Kaverinskiy, Anna Litvin, Kyrylo Malakhov
- Hallucinations In Large Multilingual Translation Models Nuno M. Guerreiro et al.
- Large Language Models Are Built-in Autoregressive Search Engines Noah Ziems, Wenhao Yu, Zhihan Zhang, Meng Jiang
- Reflexion: Language Agents With Verbal Reinforcement Learning Noah Shinn et al.
- Chatgpt Is A Knowledgeable But Inexperienced Solver: An Investigation Of Commonsense Problem In Large Language Models Ning Bian et al.
- Datatales: Investigating The Use Of Large Language Models For Authoring Data-driven Articles Nicole Sultanum, Arjun Srinivasan
- A Stitch In Time Saves Nine: Detecting And Mitigating Hallucinations Of Llms By Validating Low-confidence Generation Neeraj Varshney, Wenlin Yao, Hongming Zhang, Jianshu Chen, Dong Yu
- Large Language Models Are Zero-shot Time Series Forecasters Nate Gruver, Marc Finzi, Shikai Qiu, Andrew Gordon Wilson
- Chatgpt MT: Competitive For High- (but Not Low-) Resource Languages Nathaniel R. Robinson, Perez Ogayo, David R. Mortensen, Graham Neubig
- Consistency Analysis Of Chatgpt Myeongjun Erik Jang, Thomas Lukasiewicz
- Benefits And Harms Of Large Language Models In Digital Mental Health Munmun De Choudhury, Sachin R. Pendse, Neha Kumar
- Label Supervised Llama Finetuning Zongxi Li et al.
- Scaling Vision Transformers To 22 Billion Parameters Mostafa Dehghani et al.
- Abscribe: Rapid Exploration & Organization Of Multiple Writing Variations In Human-ai Co-writing Tasks Using Large Language Models Mohi Reza et al.
- Introducing Language Guidance In Prompt-based Continual Learning Muhammad Gul Zain Ali Khan et al.
- Time-llm: Time Series Forecasting By Reprogramming Large Language Models Ming Jin et al.
- Reward Design With Language Models Minae Kwon, Sang Michael Xie, Kalesha Bullard, Dorsa Sadigh
- Scalable Extraction Of Training Data From (production) Language Models Milad Nasr et al.
- Detecting Llm-generated Text In Computing Education: A Comparative Study For Chatgpt Cases Michael Sheinman Orenstrakh, Oscar Karnalim, Carlos Anibal Suarez, Michael Liut
- Med-flamingo: A Multimodal Medical Few-shot Learner Michael Moor et al.
- A Large Language Model Approach To Educational Survey Feedback Analysis Michael J. Parker, Caitlin Anderson, Claire Stone, Yearim Oh
- Hyena Hierarchy: Towards Larger Convolutional Language Models Michael Poli et al.
- LAMM: Language-assisted Multi-modal Instruction-tuning Dataset, Framework, And Benchmark Zhenfei Yin et al.
- Psy-llm: Scaling Up Global Mental Health Psychological Services With Ai-based Large Language Models Tin Lai et al.
- Toolformer: Language Models Can Teach Themselves To Use Tools Timo Schick et al.
- Spqr: A Sparse-quantized Representation For Near-lossless LLM Weight Compression Tim Dettmers et al.
- RLHF-V: Towards Trustworthy Mllms Via Behavior Alignment From Fine-grained Correctional Human Feedback Tianyu Yu et al.
- Enabling Large Language Models To Generate Text With Citations Tianyu Gao, Howard Yen, Jiatong Yu, Danqi Chen
- Encouraging Divergent Thinking In Large Language Models Through Multi-agent Debate Tian Liang et al.
- Grounding Large Language Models In Interactive Environments With Online Reinforcement Learning Thomas Carta et al.
- Deception Abilities Emerged In Large Language Models Thilo Hagendorff
- Caption Anything: Interactive Image Description With Diverse Multimodal Controls Teng Wang et al.
- Open-ended Medical Visual Question Answering Through Prefix Tuning Of Language Models Tom Van Sonsbeek, Mohammad Mahdi Derakhshani, Ivona Najdenkoska, Cees G. M. Snoek, Marcel Worring
- Mindfuldiary: Harnessing Large Language Model To Support Psychiatric Patients' Journaling Taewan Kim et al.
- Sparks Of Artificial General Intelligence: Early Experiments With GPT-4 Sébastien Bubeck et al.
- Large Language Models As General Pattern Machines Suvir Mirchandani et al.
- Pretraining Language Models With Human Preferences Tomasz Korbak et al.
- AI, Write An Essay For Me: A Large-scale Comparison Of Human-written Versus Chatgpt-generated Essays Steffen Herbold, Annette Hautli-janisz, Ute Heuer, Zlata Kikteva, Alexander Trautsch
- Expressive Text-to-image Generation With Rich Text Songwei Ge, Taesung Park, Jun-yan Zhu, Jia-bin Huang
- Zhongjing: Enhancing The Chinese Medical Capabilities Of Large Language Model Through Expert Feedback And Real-world Multi-turn Dialogue Songhua Yang et al.
- Principled Instructions Are All You Need For Questioning Llama-1/2, GPT-3.5/4 Sondos Mahmoud Bsharat, Aidar Myrzakhan, Zhiqiang Shen
- Llm-empowered Chatbots For Psychiatrist And Patient Simulation: Application And Evaluation Siyuan Chen et al.
- On The Possibilities Of Ai-generated Text Detection Souradip Chakraborty et al.
- Thoughtsource: A Central Hub For Large Language Model Reasoning Data Simon Ott et al.
- Mind Meets Machine: Unravelling Gpt-4's Cognitive Psychology Sifatkaur Dhingra, Manmeet Singh, Vaisakh Sb, Neetiraj Malviya, Sukhpal Singh Gill
- A Survey On Multimodal Large Language Models Shukang Yin et al.
- Opportunities And Challenges For Chatgpt And Large Language Models In Biomedicine And Health Shubo Tian et al.
- Retrieving Supporting Evidence For Generative Question Answering Siqing Huo, Negar Arabzadeh, Charles L. A. Clarke
- Do Llms Understand User Preferences? Evaluating Llms On User Rating Prediction Wang-cheng Kang et al.
- Memorybank: Enhancing Large Language Models With Long-term Memory Wanjun Zhong, Lianghong Guo, Qiqi Gao, He Ye, Yanlin Wang
- LIDA: A Tool For Automatic Generation Of Grammar-agnostic Visualizations And Infographics Using Large Language Models Victor Dibia
- Scaling Down To Scale Up: A Guide To Parameter-efficient Fine-tuning Vladislav Lialin, Vijeta Deshpande, Xiaowei Yao, Anna Rumshisky
- Evaluating Correctness And Faithfulness Of Instruction-following Models For Question Answering Vaibhav Adlakha, Parishad Behnamghader, Xing Han Lu, Nicholas Meade, Siva Reddy
- Language Model Behavior: A Comprehensive Survey Tyler A. Chang, Benjamin K. Bergen
- Automating Human Tutor-style Programming Feedback: Leveraging GPT-4 Tutor Model For Hint Generation And GPT-3.5 Student Model For Hint Validation Tung Phung et al.
- Generative AI For Programming Education: Benchmarking Chatgpt, GPT-4, And Human Tutors Tung Phung et al.
- Freshllms: Refreshing Large Language Models With Search Engine Augmentation Tu Vu et al.
- Flashattention-2: Faster Attention With Better Parallelism And Work Partitioning Tri Dao
- Nemo Guardrails: A Toolkit For Controllable And Safe LLM Applications With Programmable Rails Traian Rebedea, Razvan Dinu, Makesh Sreedhar, Christopher Parisien, Jonathan Cohen
- Creativity Support In The Age Of Large Language Models: An Empirical Study Involving Emerging Writers Tuhin Chakrabarty, Vishakh Padmakumar, Faeze Brahman, Smaranda Muresan
- Automatic Semantic Augmentation Of Language Model Prompts (for Code Summarization) Toufique Ahmed, Kunal Suresh Pai, Premkumar Devanbu, Earl T. Barr
- Trusting Your Evidence: Hallucinate Less With Context-aware Decoding Weijia Shi et al.
- Copiloting The Copilots: Fusing Large Language Models With Completion Engines For Automated Program Repair Yuxiang Wei, Chunqiu Steven Xia, Lingming Zhang
- Alpha-clip: A CLIP Model Focusing On Wherever You Want Zeyi Sun et al.
- Llmrec: Large Language Models With Graph Augmentation For Recommendation Wei Wei et al.
- Can Large Language Models Provide Useful Feedback On Research Papers? A Large-scale Empirical Analysis Weixin Liang et al.
- GPT Detectors Are Biased Against Non-native English Writers Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, James Zou
- Is Chatgpt Good At Search? Investigating Large Language Models As Re-ranking Agents Weiwei Sun et al.
- BLIVA: A Simple Multimodal LLM For Better Handling Of Text-rich Visual Questions Wenbo Hu et al.
- Roco: Dialectic Multi-robot Collaboration With Large Language Models Zhao Mandi, Shreeya Jain, Shuran Song
- Large Language Models As Zero-shot Conversational Recommenders Zhankui He et al.
- Promptify: Text-to-image Generation Through Interactive Prompt Exploration With Large Language Models Stephen Brade, Bryan Wang, Mauricio Sousa, Sageev Oore, Tovi Grossman
- Chatgpt For PLC/DCS Control Logic Generation Heiko Koziolek, Sten Gruener, Virendra Ashiwal
- Capabilities Of GPT-4 On Medical Challenge Problems Harsha Nori, Nicholas King, Scott Mayer Mckinney, Dean Carignan, Eric Horvitz
- Chatgpt Or Grammarly? Evaluating Chatgpt On Grammatical Error Correction Benchmark Haoran Wu, Wenxuan Wang, Yuxuan Wan, Wenxiang Jiao, Michael Lyu
- Autodroid: Llm-powered Task Automation In Android Hao Wen et al.
- Safety Assessment Of Chinese Large Language Models Hao Sun, Zhexin Zhang, Jiawen Deng, Jiale Cheng, Minlie Huang
- Chain Of Hindsight Aligns Language Models With Feedback Hao Liu, Carmelo Sferrazza, Pieter Abbeel
- Visual Instruction Tuning Haotian Liu, Chunyuan Li, Qingyang Wu, Yong Jae Lee
- Personalisation Within Bounds: A Risk Taxonomy And Policy Framework For The Alignment Of Large Language Models With Personalised Feedback Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale
- Llm-rec: Personalized Recommendation Via Prompting Large Language Models Hanjia Lyu et al.
- Prompting Large Language Models For Topic Modeling Han Wang et al.
- GPT-4 Enhanced Multimodal Grounding For Autonomous Driving: Leveraging Cross-modal Attention With Large Language Models Haicheng Liao et al.
- Wizardmath: Empowering Mathematical Reasoning For Large Language Models Via Reinforced Evol-instruct Haipeng Luo et al.
- Applying Large Language Models And Chain-of-thought For Automatic Scoring Gyeong-geon Lee, Ehsan Latif, Xuansheng Wu, Ninghao Liu, Xiaoming Zhai
- Revisiting Large Language Models As Zero-shot Relation Extractors Guozheng Li, Peng Wang, Wenjun Ke
- Chatgpt Hallucinates When Attributing Answers Guido Zuccon, Bevan Koopman, Razia Shaik
- The Refinedweb Dataset For Falcon LLM: Outperforming Curated Corpora With Web Data, And Web Data Only Guilherme Penedo et al.
- Perspectives On Large Language Models For Relevance Judgment Guglielmo Faggioli et al.
- Voyager: An Open-ended Embodied Agent With Large Language Models Guanzhi Wang et al.
- Language Models Can Solve Computer Tasks Geunwoo Kim, Pierre Baldi, Stephen Mcaleer
- Personality Traits In Large Language Models Greg Serapio-garcía et al.
- Explainability For Large Language Models: A Survey Haiyan Zhao et al.
- Lawbench: Benchmarking Legal Knowledge Of Large Language Models Zhiwei Fei et al.
- Synthetic Data Generation With Large Language Models For Text Classification: Potential And Limitations Zhuoyan Li, Hangxiao Zhu, Zhuoran Lu, Ming Yin
- Do Large Language Models Show Decision Heuristics Similar To Humans? A Case Study Using GPT-3.5 Gaurav Suri, Lily R. Slater, Ali Ziaee, Morgan Nguyen
- Batch Prompting: Efficient Inference With Large Language Model Apis Zhoujun Cheng, Jungo Kasai, Tao Yu
- Multimodal Chatgpt For Medical Applications: An Experimental Study Of GPT-4V Zhiling Yan et al.
- Lost In Translation: Large Language Models In Non-english Content Analysis Gabriel Nicholas, Aliya Bhatia
- Repocoder: Repository-level Code Completion Through Iterative Retrieval And Generation Fengji Zhang et al.
- LLMR: Real-time Prompting Of Interactive Worlds Using Large Language Models Fernanda De La Torre et al.
- Preference Ranking Optimization For Human Alignment Feifan Song et al.
- Should Chatgpt Be Biased? Challenges And Risks Of Bias In Large Language Models Emilio Ferrara
- Survey Of Vulnerabilities In Large Language Models Revealed By Adversarial Attacks Erfan Shayegani et al.
- Assigning AI: Seven Approaches For Students, With Prompts Ethan Mollick, Lilach Mollick
- Aligning Large Multimodal Models With Factually Augmented RLHF Zhiqing Sun et al.
- Kosmos-2: Grounding Multimodal Large Language Models To The World Zhiliang Peng et al.
- Simulating H.P. Lovecraft Horror Literature With The Chatgpt Large Language Model Eduardo C. Garrido-merchán, José Luis Arroyo-barrigüete, Roberto Gozalo-brizuela
- The Falcon Series Of Open Language Models Ebtesam Almazrouei et al.
- GPT-4 Can Pass The Korean National Licensing Examination For Korean Medicine Doctors Dongyeop Jang, Tae-rim Yun, Choong-yeol Lee, Young-kyu Kwon, Chang-eop Kim
- MELTR: Meta Loss Transformer For Learning To Fine-tune Video Foundation Models Dohwan Ko et al.
- Evaluating Open-domain Question Answering In The Era Of Large Language Models Ehsan Kamalloo, Nouha Dziri, Charles L. A. Clarke, Davood Rafiei
- The Vector Grounding Problem Dimitri Coelho Mollo, Raphaël Millière
- Promptner: Prompting For Named Entity Recognition Dhananjay Ashok, Zachary C. Lipton
- Minigpt-4: Enhancing Vision-language Understanding With Advanced Large Language Models Deyao Zhu, Jun Chen, Xiaoqian Shen, Xiang Li, Mohamed Elhoseiny
- Chatgpt Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions Deyao Zhu et al.
- The Capacity For Moral Self-correction In Large Language Models Deep Ganguli et al.
- 3d-vista: Pre-trained Transformer For 3D Vision And Text Alignment Ziyu Zhu et al.
- Palm-e: An Embodied Multimodal Language Model Danny Driess et al.
- Almanac: Retrieval-augmented Language Models For Clinical Medicine Cyril Zakka et al.
- Weak-to-strong Generalization: Eliciting Strong Capabilities With Weak Supervision Collin Burns et al.
- LIMA: Less Is More For Alignment Chunting Zhou et al.
- Drivelm: Driving With Graph Visual Question Answering Chonghao Sima et al.
- Opportunities And Risks Of Llms For Scalable Deliberation With Polis Christopher T. Small et al.
- Debiasing Vision-language Models Via Biased Prompts Ching-yao Chuang, Varun Jampani, Yuanzhen Li, Antonio Torralba, Stefanie Jegelka
- Macaw-llm: Multi-modal Language Modeling With Image, Audio, Video, And Text Integration Chenyang Lyu et al.
- Supporting Qualitative Analysis With Large Language Models: Combining Codebook With GPT-3 For Deductive Coding Ziang Xiao, Xingdi Yuan, Q. Vera Liao, Rania Abdelghani, Pierre-yves Oudeyer
- Visual Chatgpt: Talking, Drawing And Editing With Visual Foundation Models Chenfei Wu et al.
- Dipping Plms Sauce: Bridging Structure And Text For Effective Knowledge Graph Completion Via Conditional Soft Prompting Chen Chen, Yufei Wang, Aixin Sun, Bing Li, Kwok-yan Lam
- Memgpt: Towards Llms As Operating Systems Charles Packer et al.
- MME: A Comprehensive Evaluation Benchmark For Multimodal Large Language Models Chaoyou Fu et al.
- Supporting Human-ai Collaboration In Auditing Llms With Llms Charvi Rastogi, Marco Tulio Ribeiro, Nicholas King, Harsha Nori, Saleema Amershi
- One Small Step For Generative AI, One Giant Leap For AGI: A Complete Survey On Chatgpt In AIGC Era Chaoning Zhang et al.
- Blackvip: Black-box Visual Prompting For Robust Transfer Learning Changdae Oh et al.
- Does GPT-4 Pass The Turing Test? Cameron R. Jones, Benjamin K. Bergen
- Receive, Reason, And React: Drive As You Say With Large Language Models In Autonomous Vehicles Can Cui, Yunsheng Ma, Xu Cao, Wenqian Ye, Ziran Wang
- Chatgpt And A New Academic Reality: Artificial Intelligence-written Research Papers And The Ethics Of The Large Language Models In Scholarly Publishing Brady Lund et al.
- Reinforced Self-training (rest) For Language Modeling Caglar Gulcehre et al.
- Large Language Models On Graphs: A Comprehensive Survey Bowen Jin et al.
- LLM+P: Empowering Large Language Models With Optimal Planning Proficiency Bo Liu et al.
- RWKV: Reinventing Rnns For The Transformer Era Bo Peng et al.
- Seed-bench-2: Benchmarking Multimodal Large Language Models Bohao Li et al.
- A Short Survey Of Viewing Large Language Models In Legal Aspect Zhongxiang Sun
- Swiftsage: A Generative Agent With Fast And Slow Thinking For Complex Interactive Tasks Bill Yuchen Lin et al.
- ART: Automatic Multi-step Reasoning And Tool-use For Large Language Models Bhargavi Paranjape et al.
- Can Large Language Models Transform Computational Social Science? Caleb Ziems et al.
- Large Language Models In The Workplace: A Case Study On Prompt Engineering For Job Type Classification Benjamin Clavié, Alexandru Ciceu, Frederick Naylor, Guillaume Soulié, Thomas Brightwell
- Bad Actor, Good Advisor: Exploring The Role Of Large Language Models In Fake News Detection Beizhe Hu et al.
- Friend Or Foe? Exploring The Implications Of Large Language Models On The Science System Benedikt Fecher, Marcel Hebing, Melissa Laufer, Jörg Pohle, Fabian Sofsky
- Code Llama: Open Foundation Models For Code Baptiste Rozière et al.
- Check Your Facts And Try Again: Improving Large Language Models With External Knowledge And Automated Feedback Baolin Peng et al.
- Instruction Tuning With GPT-4 Baolin Peng, Chunyuan Li, Pengcheng He, Michel Galley, Jianfeng Gao
- Expertprompting: Instructing Large Language Models To Be Distinguished Experts Benfeng Xu et al.
- A Study Of Generative Large Language Model For Medical Research And Healthcare Cheng Peng et al.
- Facilitating Self-guided Mental Health Interventions Through Human-language Model Interaction: A Case Study Of Cognitive Restructuring Ashish Sharma, Kevin Rushton, Inna Wanyin Lin, Theresa Nguyen, Tim Althoff
- The False Promise Of Imitating Proprietary Llms Arnav Gudibande et al.
- Exploring The Responses Of Large Language Models To Beginner Programmers' Help Requests Arto Hellas et al.
- Scaling Transformer To 1M Tokens And Beyond With RMT Aydar Bulatov, Yuri Kuratov, Yermek Kapushev, Mikhail S. Burtsev
- Orca 2: Teaching Small Language Models How To Reason Arindam Mitra et al.
- Chatgpt: Applications, Opportunities, And Threats Aram Bahrini et al.
- Med-halt: Medical Domain Hallucination Test For Large Language Models Ankit Pal, Logesh Kumar Umapathi, Malaikannan Sankarasubbu
- Interpretable Long-form Legal Question Answering With Retrieval-augmented Large Language Models Antoine Louis, Gijs Van Dijck, Gerasimos Spanakis
- Detecting And Preventing Hallucinations In Large Vision Language Models Anisha Gunjal, Jihan Yin, Erhan Bas
- Expel: LLM Agents Are Experiential Learners Andrew Zhao et al.
- Toxicchat: Unveiling Hidden Challenges Of Toxicity Detection In Real-world User-ai Conversation Zi Lin et al.
- On The Application Of Large Language Models For Language Teaching And Assessment Technology Andrew Caines et al.
- Fundamentals Of Generative Large Language Models And Perspectives In Cyber-defense Andrei Kucharavy et al.
- Chemcrow: Augmenting Large-language Models With Chemistry Tools Andres M Bran et al.
- Openassistant Conversations -- Democratizing Large Language Model Alignment Andreas Köpf et al.
- Opening Up Chatgpt: Tracking Openness, Transparency, And Accountability In Instruction-tuned Text Generators Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse
- On Generative Agents In Recommendation An Zhang, Yuxin Chen, Leheng Sheng, Xiang Wang, Tat-seng Chua
- Self-refine: Iterative Refinement With Self-feedback Aman Madaan et al.
- Jailbroken: How Does LLM Safety Training Fail? Alexander Wei, Nika Haghtalab, Jacob Steinhardt
- Language Model Tokenizers Introduce Unfairness Between Languages Aleksandar Petrov, Emanuele La Malfa, Philip H. S. Torr, Adel Bibi
- Conversational Ai-powered Design: Chatgpt As Designer, User, And Product A. Baki Kocaballi
- Toolllm: Facilitating Large Language Models To Master 16000+ Real-world Apis Yujia Qin et al.
- Enhancing Retrieval-augmented Large Language Models With Iterative Retrieval-generation Synergy Zhihong Shao et al.
- Adaptive Machine Translation With Large Language Models Yasmin Moslem, Rejwanul Haque, John D. Kelleher, Andy Way
- Embodiedgpt: Vision-language Pre-training Via Embodied Chain Of Thought Yao Mu et al.
- Chatpose: Chatting About 3D Human Pose Yao Feng et al.
- Powerinfer: Fast Large Language Model Serving With A Consumer-grade GPU Yixin Song, Zeyu Mi, Haotong Xie, Haibo Chen
- Better To Ask In English: Cross-lingual Evaluation Of Large Language Models For Healthcare Queries Yiqiao Jin et al.
- 3D-LLM: Injecting The 3D World Into Large Language Models Yining Hong et al.
- On Learning To Summarize With Large Language Models As References Yixin Liu et al.
- Enhancing Job Recommendation Through Llm-based Generative Adversarial Networks Yingpeng Du et al.
- Can Chatgpt Replace Traditional KBQA Models? An In-depth Analysis Of The Question Answering Performance Of The GPT LLM Family Yiming Tan et al.
- A Survey On Large Language Model (LLM) Security And Privacy: The Good, The Bad, And The Ugly Yifan Yao et al.
- Summary Of Chatgpt-related Research And Perspective Towards The Future Of Large Language Models Yiheng Liu et al.
- "Kelly Is A Warm Person, Joseph Is A Role Model": Gender Biases In Llm-generated Reference Letters Yixin Wan et al.
- Human-centric Autonomous Systems With Llms For User Command Reasoning Yi Yang et al.
- Making Large Language Models Perform Better In Knowledge Graph Completion Yichi Zhang et al.
- INSTRUCTEVAL: Towards Holistic Evaluation Of Instruction-tuned Large Language Models Yew Ken Chia, Pengfei Hong, Lidong Bing, Soujanya Poria
- A Multitask, Multilingual, Multimodal Evaluation Of Chatgpt On Reasoning, Hallucination, And Interactivity Yejin Bang et al.
- Assessing Cross-cultural Alignment Between Chatgpt And Human Societies: An Empirical Study Yong Cao et al.
- Analyzing And Mitigating Object Hallucination In Large Vision-language Models Yiyang Zhou et al.
- Key-locked Rank One Editing For Text-to-image Personalization Yoad Tewel, Rinon Gal, Gal Chechik, Yuval Atzmon
- Fundamental Limitations Of Alignment In Large Language Models Yotam Wolf, Noam Wies, Oshri Avnery, Yoav Levine, Amnon Shashua
- Llavar: Enhanced Visual Instruction Tuning For Text-rich Image Understanding Yanzhe Zhang et al.
- Alpacafarm: A Simulation Framework For Methods That Learn From Human Feedback Yann Dubois et al.
- Llama-vid: An Image Is Worth 2 Tokens In Large Language Models Yanwei Li, Chengyao Wang, Jiaya Jia
- Specializing Smaller Language Models Towards Multi-step Reasoning Yao Fu, Hao Peng, Litu Ou, Ashish Sabharwal, Tushar Khot
- Trustworthy Llms: A Survey And Guideline For Evaluating Large Language Models' Alignment Yang Liu et al.
- Mental-llm: Leveraging Large Language Models For Mental Health Prediction Via Online Text Data Xuhai Xu et al.
- A Survey On Model Compression For Large Language Models Xunyu Zhu, Jian Li, Yong Liu, Can Ma, Weiping Wang
- Emotional Intelligence Of Large Language Models Xuena Wang, Xueting Li, Zi Yin, Yue Wu, Liu Jia
- Chat With The Environment: Interactive Multimodal Perception Using Large Language Models Xufeng Zhao, Mengdi Li, Cornelius Weber, Muhammad Burhan Hafez, Stefan Wermter
- Integrating Action Knowledge And Llms For Task Planning And Situation Handling In Open Worlds Yan Ding et al.
- Can Chatgpt Pass The Vietnamese National High School Graduation Examination? Xuan-quy Dao, Ngoc-bich Le, Xuan-dung Phan, Bac-bien Ngo
- Ghost In The Minecraft: Generally Capable Agents For Open-world Environments Via Large Language Models With Text-based Knowledge And Memory Xizhou Zhu et al.
- "Do Anything Now": Characterizing And Evaluating In-the-wild Jailbreak Prompts On Large Language Models Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, Yang Zhang
- How Robust Is GPT-3.5 To Predecessors? A Comprehensive Study On Language Understanding Tasks Xuanting Chen et al.
- Vision Language Models In Autonomous Driving: A Survey And Outlook Xingcheng Zhou et al.
- Query Rewriting For Retrieval-augmented Large Language Models Xinbei Ma, Yeyun Gong, Pengcheng He, Hai Zhao, Nan Duan
- PMC-VQA: Visual Instruction Tuning For Medical Visual Question Answering Xiaoman Zhang et al.
- LISA: Reasoning Segmentation Via Large Language Model Xin Lai et al.
- Unveiling Security, Privacy, And Ethical Concerns Of Chatgpt Xiaodong Wu, Ran Duan, Jianbing Ni
- Deceptive AI Ecosystems: The Case Of Chatgpt Xiao Zhan, Yifan Xu, Stefan Sarkadi
- Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! Xiangyu Qi et al.
- Beyond Chatbots: Explorellm For Structured Thoughts And Personalized Model Responses Xiao Ma et al.
- Medagents: Large Language Models As Collaborators For Zero-shot Medical Reasoning Xiangru Tang et al.
- Representation Learning With Large Language Models For Recommendation Xubin Ren et al.
- Chacha: Leveraging Large Language Models To Prompt Children To Share Their Emotions About Personal Events Woosuk Seo, Chanmo Yang, Young-ho Kim
- Cogagent: A Visual Language Model For GUI Agents Wenyi Hong et al.
- Language Models Represent Space And Time Wes Gurnee, Max Tegmark
- M3exam: A Multilingual, Multimodal, Multilevel Benchmark For Examining Large Language Models Wenxuan Zhang, Sharifah Mahani Aljunied, Chang Gao, Yew Ken Chia, Lidong Bing
- Large Language Models In Education: Vision And Opportunities Wensheng Gan, Zhenlian Qi, Jiayang Wu, Jerry Chun-wei Lin
- LIV: Language-image Representations And Rewards For Robotic Control Yecheng Jason Ma et al.
- SPHINX: The Joint Mixing Of Weights, Tasks, And Visual Embeddings For Multi-modal Large Language Models Ziyi Lin et al.
- Guiding Pretraining In Reinforcement Learning With Large Language Models Yuqing Du et al.
- Retentive Network: A Successor To Transformer For Large Language Models Yutao Sun et al.
- Editing Large Language Models: Problems, Methods, And Opportunities Yunzhi Yao et al.
- Chatdoctor: A Medical Chat Model Fine-tuned On A Large Language Model Meta-ai (llama) Using Medical Domain Knowledge Yunxiang Li et al.
- Describe, Explain, Plan And Select: Interactive Planning With Large Language Models Enables Open-world Multi-task Agents Zihao Wang et al.
- Towards Open-world Recommendation With Knowledge Augmentation From Large Language Models Yunjia Xi et al.
- Chat-rec: Towards Interactive And Explainable Llms-augmented Recommender System Yunfan Gao et al.
- Tool Learning With Foundation Models Yujia Qin et al.
- Exploring The Impact Of Instruction Data Scaling On Large Language Models: An Empirical Study On Real-world Use Cases Yunjie Ji et al.
- Educhat: A Large-scale Language Model-based Chatbot System For Intelligent Education Yuhao Dan et al.
- Toolqa: A Dataset For LLM Question Answering With External Tools Yuchen Zhuang, Yue Yu, Kuan Wang, Haotian Sun, Chao Zhang
- Guiding Large Language Models Via Directional Stimulus Prompting Zekun Li et al.
- Is Chatgpt A Good Sentiment Analyzer? A Preliminary Study Zengzhi Wang et al.
- Fine-grained Human Feedback Gives Better Rewards For Language Model Training Zeqiu Wu et al.
- Large Language Models In Healthcare And Medical Domain: A Review Zabir Al Nazi, Wei Peng
- Monitoring Ai-modified Content At Scale: A Case Study On The Impact Of Chatgpt On AI Conference Peer Reviews Weixin Liang et al.
- Assessing AI Detectors In Identifying Ai-generated Code: Implications For Education Wei Hung Pan et al.
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- Towards Conversational Diagnostic AI Tao Tu et al.
- Who Validates The Validators? Aligning Llm-assisted Evaluation Of LLM Outputs With Human Preferences Shreya Shankar, J. D. Zamfirescu-pereira, Björn Hartmann, Aditya G. Parameswaran, Ian Arawjo
- Large Language Models Meet Collaborative Filtering: An Efficient All-round Llm-based Recommender System Sein Kim et al.
- An Empirical Study On Usage And Perceptions Of Llms In A Software Engineering Project Sanka Rasnayaka, Guanlin Wang, Ridwan Shariffdeen, Ganesh Neelakanta Iyer
- A Comprehensive Survey Of Hallucination Mitigation Techniques In Large Language Models S. M Towhidul Islam Tonmoy et al.
- Beyond Code Generation: An Observational Study Of Chatgpt Usage In Software Engineering Practice Ranim Khojah, Mazen Mohamad, Philipp Leitner, Francisco Gomes De Oliveira Neto
- A Systematic Survey Of Prompt Engineering In Large Language Models: Techniques And Applications Pranab Sahoo et al.
- Ai-augmented Brainwriting: Investigating The Use Of Llms In Group Ideation Orit Shaer, Angelora Cooper, Osnat Mokryn, Andrew L. Kun, Hagit Ben Shoshan
- Jamba: A Hybrid Transformer-mamba Language Model Opher Lieber et al.
- Iris: An Ai-driven Virtual Tutor For Computer Science Education Patrick Bassner, Eduard Frankford, Stephan Krusche
- CBR-RAG: Case-based Reasoning For Retrieval Augmented Generation In Llms For Legal Question Answering Nirmalie Wiratunga et al.
- A Survey Of Resource-efficient LLM And Multimodal Foundation Models Mengwei Xu et al.
- History Of Generative Artificial Intelligence (AI) Chatbots: Past, Present, And Future Development Md. Al-amin et al.
- A Review Of Large Language Models And Autonomous Agents In Chemistry Mayk Caldas Ramos, Christopher J. Collison, Andrew D. White
- What Large Language Models Know And What People Think They Know Mark Steyvers et al.
- A Piece Of Theatre: Investigating How Teachers Design LLM Chatbots To Assist Adolescent Cyberbullying Education Michael A. Hedderich et al.
- Codeaid: Evaluating A Classroom Deployment Of An Llm-based Programming Assistant That Balances Student And Educator Needs Majeed Kazemitabaar et al.
- Language Models For Code Completion: A Practical Evaluation Maliheh Izadi et al.
- Capabilities Of Gemini Models In Medicine Khaled Saab et al.
- Supporting Sensemaking Of Large Language Model Outputs At Scale Katy Ilonka Gero, Chelse Swoopes, Ziwei Gu, Jonathan K. Kummerfeld, Elena L. Glassman
- The Dawn After The Dark: An Empirical Study On Factuality Hallucination In Large Language Models Junyi Li et al.
- Pixart-Σ: Weak-to-strong Training Of Diffusion Transformer For 4K Text-to-image Generation Junsong Chen et al.
- Clochat: Understanding How People Customize, Interact, And Experience Personas In Large Language Models Juhye Ha, Hyeon Jeon, Daeun Han, Jinwook Seo, Changhoon Oh
- Feedback-generation For Programming Exercises With GPT-4 Imen Azaiz, Natalie Kiesler, Sven Strickroth
- (A)I Am Not A Lawyer, But...: Engaging Legal Experts Towards Responsible LLM Policies For Legal Advice Inyoung Cheong, King Xia, K. J. Kevin Feng, Quan Ze Chen, Amy X. Zhang
- Fine Tuning Vs. Retrieval Augmented Generation For Less Popular Knowledge Heydar Soudani, Evangelos Kanoulas, Faegheh Hasibi
- Gemma: Open Models Based On Gemini Research And Technology Gemma Team et al.
- Gemma 2: Improving Open Language Models At A Practical Size Gemma Team et al.
- Gemini 1.5: Unlocking Multimodal Understanding Across Millions Of Tokens Of Context Gemini Team et al.
- Large Language Models In Cybersecurity: State-of-the-art Farzad Nourmohammadzadeh Motlagh et al.
- Understanding The Impact Of Long-term Memory On Self-disclosure With Large Language Model-driven Chatbots For Public Health Intervention Eunkyung Jo, Yuin Jeong, Sohyun Park, Daniel A. Epstein, Young-ho Kim
- Embedding Large Language Models Into Extended Reality: Opportunities And Challenges For Inclusion, Engagement, And Privacy Efe Bozkir et al.
- Deepseek-v2: A Strong, Economical, And Efficient Mixture-of-experts Language Model Deepseek-ai et al.
- Generative AI In EU Law: Liability, Privacy, Intellectual Property, And Cybersecurity Claudio Novelli, Federico Casolari, Philipp Hacker, Giorgio Spedicato, Luciano Floridi
- Open Source Language Models Can Provide Feedback: Evaluating Llms' Ability To Help Students Using Gpt-4-as-a-judge Charles Koutcheme et al.
- Rethinking Interpretability In The Era Of Large Language Models Chandan Singh, Jeevana Priya Inala, Michel Galley, Rich Caruana, Jianfeng Gao
- MM1: Methods, Analysis & Insights From Multimodal LLM Pre-training Brandon Mckinzie et al.
- Homogenization Effects Of Large Language Models On Human Creative Ideation Barrett R. Anderson, Jash Hemant Shah, Max Kreminski
- Large Language Models And User Trust: Consequence Of Self-referential Learning Loop And The Deskilling Of Healthcare Professionals Avishek Choudhury, Zaria Chaudhry
- Taking The Next Step With Generative Artificial Intelligence: The Transformative Role Of Multimodal Large Language Models In Science Education Arne Bewersdorff et al.
- AI And Memory Wall Amir Gholami et al.
- Optimization Methods For Personalizing Large Language Models Through Retrieval Augmentation Alireza Salemi, Surya Kallumadi, Hamed Zamani
- Does Fine-tuning Llms On New Knowledge Encourage Hallucinations? Zorik Gekhman et al.
- Harnessing Large Language Models For Text-rich Sequential Recommendation Zhi Zheng, Wenshuo Chao, Zhaopeng Qiu, Hengshu Zhu, Hui Xiong
- Large Language Models For Data Annotation And Synthesis: A Survey Zhen Tan et al.
- How Far Are We To GPT-4V? Closing The Gap To Commercial Multimodal Models With Open-source Suites Zhe Chen et al.
- Let Me Do It For You: Towards LLM Empowered Recommendation Via Tool Learning Yuyue Zhao et al.
- Autocoderover: Autonomous Program Improvement Yuntong Zhang, Haifeng Ruan, Zhiyu Fan, Abhik Roychoudhury
- Survey On Large Language Model-enhanced Reinforcement Learning: Concept, Taxonomy, And Methods Yuji Cao et al.
- Trustllm: Trustworthiness In Large Language Models Yue Huang et al.
- Sora: A Review On Background, Technology, Limitations, And Opportunities Of Large Vision Models Yixin Liu et al.
- How Johnny Can Persuade Llms To Jailbreak Them: Rethinking Persuasion To Challenge AI Safety By Humanizing Llms Yi Zeng et al.
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
- A Review Of Modern Recommender Systems Using Generative Models (gen-recsys) Yashar Deldjoo et al.
- Datasets For Large Language Models: A Comprehensive Survey Yang Liu, Jiahuan Cao, Chongyu Liu, Kai Ding, Lianwen Jin
- Data-efficient Fine-tuning For Llm-based Recommendation Xinyu Lin et al.
- Searching For Best Practices In Retrieval-augmented Generation Xiaohua Wang et al.
- A Survey On RAG Meeting Llms: Towards Retrieval-augmented Large Language Models Wenqi Fan et al.
- CRUD-RAG: A Comprehensive Chinese Benchmark For Retrieval-augmented Generation Of Large Language Models Yuanjie Lyu et al.
- Farsight: Fostering Responsible AI Awareness During AI Application Prototyping Zijie J. Wang, Chinmay Kulkarni, Lauren Wilcox, Michael Terry, Michael Madaio
- Can Generative Llms Create Query Variants For Test Collections? An Exploratory Study Marwah Alaofi, Luke Gallagher, Mark Sanderson, Falk Scholer, Paul Thomas
- Deepseek-r1: Incentivizing Reasoning Capability In Llms Via Reinforcement Learning Deepseek-ai et al.
🏷 Responsible AI
- Learning To Deceive With Attention-based Explanations Danish Pruthi, Mansi Gupta, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton
- Build It Break It Fix It For Dialogue Safety: Robustness From Adversarial Human Attack Emily Dinan, Samuel Humeau, Bharath Chintagunta, Jason Weston
- Recipes For Safety In Open-domain Chatbots Jing Xu et al.
- Multi-modal Open-domain Dialogue Kurt Shuster, Eric Michael Smith, Da Ju, Jason Weston
- On The Safety Of Conversational Models: Taxonomy, Dataset, And Benchmark Hao Sun et al.
- Challenges In Detoxifying Language Models Johannes Welbl et al.
- Evaluating Large Language Models Trained On Code Mark Chen et al.
- Scaling Language Models: Methods, Analysis & Insights From Training Gopher Jack W. Rae et al.
- When To Make Exceptions: Exploring Language Models As Accounts Of Human Moral Judgment Zhijing Jin et al.
- Teaching Language Models To Support Answers With Verified Quotes Jacob Menick et al.
- Lamda: Language Models For Dialog Applications Romal Thoppilan et al.
- Blenderbot 3: A Deployed Conversational Agent That Continually Learns To Responsibly Engage Kurt Shuster et al.
- Prompting GPT-3 To Be Reliable Chenglei Si et al.
- BLOOM: A 176b-parameter Open-access Multilingual Language Model Bigscience Workshop et al.
- Selection-inference: Exploiting Large Language Models For Interpretable Logical Reasoning Antonia Creswell, Murray Shanahan, Irina Higgins
- No Language Left Behind: Scaling Human-centered Machine Translation Nllb Team et al.
- LLM Self Defense: By Self Examination, Llms Know They Are Being Tricked Mansi Phute et al.
- On The Robustness Of Chatgpt: An Adversarial And Out-of-distribution Perspective Jindong Wang et al.
- How Can Recommender Systems Benefit From Large Language Models: A Survey Jianghao Lin et al.
- The Robots Are Here: Navigating The Generative AI Revolution In Computing Education James Prather et al.
- The Bigscience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset Hugo Laurençon et al.
- Llama 2: Open Foundation And Fine-tuned Chat Models Hugo Touvron et al.
- Palm 2 Technical Report Rohan Anil et al.
- How Secure Is Code Generated By Chatgpt? Raphaël Khoury, Anderson R. Avila, Jacob Brunelle, Baba Mamadou Camara
- Designerly Understanding: Information Needs For Model Transparency To Support Design Ideation For Ai-powered User Experience Q. Vera Liao, Hariharan Subramonyam, Jennifer Wang, Jennifer Wortman Vaughan
- AI Transparency In The Age Of Llms: A Human-centered Research Roadmap Q. Vera Liao, Jennifer Wortman Vaughan
- Starcoder: May The Source Be With You! Raymond Li et al.
- Hallucinations In Large Multilingual Translation Models Nuno M. Guerreiro et al.
- Jais And Jais-chat: Arabic-centric Foundation And Instruction-tuned Open Generative Large Language Models Neha Sengupta et al.
- Open Sesame! Universal Black Box Jailbreaking Of Large Language Models Raz Lapid, Ron Langberg, Moshe Sipper
- Red Teaming Chatgpt Via Jailbreaking: Bias, Robustness, Reliability And Toxicity Terry Yue Zhuo, Yujin Huang, Chunyang Chen, Zhenchang Xing
- Zhongjing: Enhancing The Chinese Medical Capabilities Of Large Language Model Through Expert Feedback And Real-world Multi-turn Dialogue Songhua Yang et al.
- Capabilities Of GPT-4 On Medical Challenge Problems Harsha Nori, Nicholas King, Scott Mayer Mckinney, Dean Carignan, Eric Horvitz
- Safety Assessment Of Chinese Large Language Models Hao Sun, Zhexin Zhang, Jiawen Deng, Jiale Cheng, Minlie Huang
- Languagempc: Large Language Models As Decision Makers For Autonomous Driving Hao Sha et al.
- Personalisation Within Bounds: A Risk Taxonomy And Policy Framework For The Alignment Of Large Language Models With Personalised Feedback Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale
- Llama Guard: Llm-based Input-output Safeguard For Human-ai Conversations Hakan Inan et al.
- Personality Traits In Large Language Models Greg Serapio-garcía et al.
- Do We Still Need Clinical Language Models? Eric Lehman et al.
- Should Chatgpt Be Biased? Challenges And Risks Of Bias In Large Language Models Emilio Ferrara
- Survey Of Vulnerabilities In Large Language Models Revealed By Adversarial Attacks Erfan Shayegani et al.
- Adapted Large Language Models Can Outperform Medical Experts In Clinical Text Summarization Dave Van Veen et al.
- Almanac: Retrieval-augmented Language Models For Clinical Medicine Cyril Zakka et al.
- Receive, Reason, And React: Drive As You Say With Large Language Models In Autonomous Vehicles Can Cui, Yunsheng Ma, Xu Cao, Wenqian Ye, Ziran Wang
- Chatgpt And A New Academic Reality: Artificial Intelligence-written Research Papers And The Ethics Of The Large Language Models In Scholarly Publishing Brady Lund et al.
- Clinical Camel: An Open Expert-level Medical Language Model With Dialogue-based Knowledge Encoding Augustin Toma et al.
- Opening Up Chatgpt: Tracking Openness, Transparency, And Accountability In Instruction-tuned Text Generators Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse
- Toxicity In Chatgpt: Analyzing Persona-assigned Language Models Ameet Deshpande, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan
- Jailbroken: How Does LLM Safety Training Fail? Alexander Wei, Nika Haghtalab, Jacob Steinhardt
- Chatgpt: More Than A Weapon Of Mass Deception, Ethical Challenges And Responses From The Human-centered Artificial Intelligence (HCAI) Perspective Alejo Jose G. Sison, Marco Tulio Daza, Roberto Gozalo-brizuela, Eduardo C. Garrido-merchán
- Better To Ask In English: Cross-lingual Evaluation Of Large Language Models For Healthcare Queries Yiqiao Jin et al.
- Fundamental Limitations Of Alignment In Large Language Models Yotam Wolf, Noam Wies, Oshri Avnery, Yoav Levine, Amnon Shashua
- Trustworthy Llms: A Survey And Guideline For Evaluating Large Language Models' Alignment Yang Liu et al.
- Improving Large Language Models For Clinical Named Entity Recognition Via Prompt Engineering Yan Hu et al.
- Vision Language Models In Autonomous Driving: A Survey And Outlook Xingcheng Zhou et al.
- Unveiling Security, Privacy, And Ethical Concerns Of Chatgpt Xiaodong Wu, Ran Duan, Jianbing Ni
- Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! Xiangyu Qi et al.
- Visual Adversarial Examples Jailbreak Aligned Large Language Models Xiangyu Qi et al.
- On Evaluating Adversarial Robustness Of Large Vision-language Models Yunqing Zhao et al.
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- Mapping The Ethics Of Generative AI: A Comprehensive Scoping Review Thilo Hagendorff
- Exploring Chatgpt And Its Impact On Society Md. Asraful Haque, Shuai Li
- Capabilities Of Gemini Models In Medicine Khaled Saab et al.
- Gemma: Open Models Based On Gemini Research And Technology Gemma Team et al.
- Gemini Goes To Med School: Exploring The Capabilities Of Multimodal Large Language Models On Medical Challenge Problems & Hallucinations Ankit Pal, Malaikannan Sankarasubbu
- Quality Of Answers Of Generative Large Language Models Vs Peer Patients For Interpreting Lab Test Results For Lay Patients: Evaluation Study Zhe He et al.
- Trustllm: Trustworthiness In Large Language Models Yue Huang et al.
- How Johnny Can Persuade Llms To Jailbreak Them: Rethinking Persuasion To Challenge AI Safety By Humanizing Llms Yi Zeng et al.
- Farsight: Fostering Responsible AI Awareness During AI Application Prototyping Zijie J. Wang, Chinmay Kulkarni, Lauren Wilcox, Michael Terry, Michael Madaio
🏷 Scaling Laws
- Scaling Laws For Neural Language Models Jared Kaplan et al.
- An Explanation Of In-context Learning As Implicit Bayesian Inference Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma
- Revisiting Neural Scaling Laws In Language And Vision Ibrahim Alabdulmohsin, Behnam Neyshabur, Xiaohua Zhai
- Cramming: Training A Language Model On A Single GPU In One Day Jonas Geiping, Tom Goldstein
- Lamda: Language Models For Dialog Applications Romal Thoppilan et al.
- Reproducible Scaling Laws For Contrastive Language-image Learning Mehdi Cherti et al.
- Vindlu: A Recipe For Effective Video-and-language Pretraining Feng Cheng et al.
- Scaling Laws And Interpretability Of Learning From Repeated Data Danny Hernandez et al.
- Large Language Models Are Zero-shot Reasoners Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa
- Scaling Laws For Language Encoding Models In Fmri Richard Antonello, Aditya Vaidya, Alexander G. Huth
- Emergent And Predictable Memorization In Large Language Models Stella Biderman et al.
- A Survey Of Large Language Models Wayne Xin Zhao et al.
- Navgpt: Explicit Reasoning In Vision-and-language Navigation With Large Language Models Gengze Zhou, Yicong Hong, Qi Wu
- Codegen2: Lessons For Training Llms On Programming And Natural Languages Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou
- AI And Memory Wall Amir Gholami et al.
🏷 Security
- Long Text Generation Via Adversarial Training With Leaked Information Jiaxian Guo et al.
- Adversarial Learning For Neural Dialogue Generation Jiwei Li et al.
- DP-GAN: Diversity-promoting Generative Adversarial Network For Generating Informative And Diversified Text Jingjing Xu, Xuancheng Ren, Junyang Lin, Xu Sun
- Maskgan: Better Text Generation Via Filling In The______ William Fedus, Ian Goodfellow, Andrew M. Dai
- Dialogue Generation: From Imitation Learning To Inverse Reinforcement Learning Ziming Li, Julia Kiseleva, Maarten De Rijke
- Efficient Contextualized Representation: Language Model Pruning For Sequence Labeling Liyuan Liu, Xiang Ren, Jingbo Shang, Jian Peng, Jiawei Han
- Adversarially Regularising Neural NLI Models To Integrate Logical Background Knowledge Pasquale Minervini, Sebastian Riedel
- Adversarial Over-sensitivity And Over-stability Strategies For Dialogue Models Tong Niu, Mohit Bansal
- Language Gans Falling Short Massimo Caccia et al.
- Toward Diverse Text Generation With Inverse Reinforcement Learning Zhan Shi, Xinchi Chen, Xipeng Qiu, Xuanjing Huang
- Retrieval-enhanced Adversarial Training For Neural Response Generation Qingfu Zhu, Lei Cui, Weinan Zhang, Furu Wei, Ting Liu
- Generating Informative And Diverse Conversational Responses Via Adversarial Information Maximization Yizhe Zhang et al.
- Attention-guided Answer Distillation For Machine Reading Comprehension Minghao Hu et al.
- Evaluating Text Gans As Language Models Guy Tevet, Gavriel Habib, Vered Shwartz, Jonathan Berant
- Generating Persona Consistent Dialogues By Exploiting Natural Language Inference Haoyu Song, Wei-nan Zhang, Jingwen Hu, Ting Liu
- Say What I Want: Towards The Dark Side Of Neural Dialogue Models Haochen Liu, Tyler Derr, Zitao Liu, Jiliang Tang
- Universal Adversarial Triggers For Attacking And Analyzing NLP Eric Wallace, Shi Feng, Nikhil Kandpal, Matt Gardner, Sameer Singh
- Empdg: Multiresolution Interactive Empathetic Dialogue Generation Qintong Li et al.
- Learning To Retrieve Reasoning Paths Over Wikipedia Graph For Question Answering Akari Asai, Kazuma Hashimoto, Hannaneh Hajishirzi, Richard Socher, Caiming Xiong
- End-to-end Bias Mitigation By Modelling Biases In Corpora Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- Winogrande: An Adversarial Winograd Schema Challenge At Scale Keisuke Sakaguchi, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi
- On The Use Of BERT For Neural Machine Translation Stéphane Clinchant, Kweon Woo Jung, Vassilina Nikoulina
- Build It Break It Fix It For Dialogue Safety: Robustness From Adversarial Human Attack Emily Dinan, Samuel Humeau, Bharath Chintagunta, Jason Weston
- Bertscore: Evaluating Text Generation With BERT Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q. Weinberger, Yoav Artzi
- Freelb: Enhanced Adversarial Training For Natural Language Understanding Chen Zhu et al.
- Evaluating Commonsense In Pre-trained Language Models Xuhui Zhou, Yue Zhang, Leyang Cui, Dandan Huang
- Insertion Transformer: Flexible Sequence Generation Via Insertion Operations Mitchell Stern, William Chan, Jamie Kiros, Jakob Uszkoreit
- ACUTE-EVAL: Improved Dialogue Evaluation With Optimized Questions And Multi-turn Comparisons Margaret Li, Jason Weston, Stephen Roller
- What Does BERT Learn From Multiple-choice Reading Comprehension Datasets? Chenglei Si, Shuohang Wang, Min-yen Kan, Jing Jiang
- Pretrained Transformers Improve Out-of-distribution Robustness Dan Hendrycks et al.
- Coda: Contrast-enhanced And Diversity-promoting Data Augmentation For Natural Language Understanding Yanru Qu et al.
- Better Robustness By More Coverage: Adversarial Training With Mixup Augmentation For Robust Fine-tuning Chenglei Si et al.
- Contextualized Perturbation For Textual Adversarial Attack Dianqi Li et al.
- A Simple But Tough-to-beat Data Augmentation Approach For Natural Language Understanding And Generation Dinghan Shen, Mingzhi Zheng, Yelong Shen, Yanru Qu, Weizhu Chen
- A Simple Language Model For Task-oriented Dialogue Ehsan Hosseini-asl, Bryan Mccann, Chien-sheng Wu, Semih Yavuz, Richard Socher
- TIME: Text And Image Mutual-translation Adversarial Networks Bingchen Liu, Kunpeng Song, Yizhe Zhu, Gerard De Melo, Ahmed Elgammal
- Better Fine-tuning By Reducing Representational Collapse Armen Aghajanyan et al.
- Generative Data Augmentation For Commonsense Reasoning Yiben Yang et al.
- Syntactic Data Augmentation Increases Robustness To Inference Heuristics Junghyun Min, R. Thomas Mccoy, Dipanjan Das, Emily Pitler, Tal Linzen
- Contrastive Learning With Adversarial Perturbations For Conditional Text Generation Seanie Lee, Dong Bok Lee, Sung Ju Hwang
- Contrastive Code Representation Learning Paras Jain et al.
- The Effect Of Natural Distribution Shift On Question Answering Models John Miller, Karl Krauth, Benjamin Recht, Ludwig Schmidt
- Simplifying Paragraph-level Question Generation Via Transformer Language Models Luis Enrico Lopez, Diane Kathryn Cruz, Jan Christian Blaise Cruz, Charibeth Cheng
- Robust Encodings: A Framework For Combating Adversarial Typos Erik Jones, Robin Jia, Aditi Raghunathan, Percy Liang
- What Does BERT Know About Books, Movies And Music? Probing BERT For Conversational Recommendation Gustavo Penha, Claudia Hauff
- Trojaning Language Models For Fun And Profit Xinyang Zhang, Zheng Zhang, Shouling Ji, Ting Wang
- Adversarial Training For Large Neural Language Models Xiaodong Liu et al.
- An Empirical Study On Robustness To Spurious Correlations Using Pre-trained Language Models Lifu Tu, Garima Lalwani, Spandana Gella, He He
- Logical Natural Language Generation From Open-domain Tables Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang
- Mitigating Gender Bias For Neural Dialogue Generation With Adversarial Learning Haochen Liu et al.
- BERT Loses Patience: Fast And Robust Inference With Early Exit Wangchunshu Zhou et al.
- Charbert: Character-aware Pre-trained Language Model Wentao Ma et al.
- LRC-BERT: Latent-representation Contrastive Knowledge Distillation For Natural Language Understanding Hao Fu et al.
- Evaluating The Robustness Of Retrieval Pipelines With Query Variation Generators Gustavo Penha, Arthur Câmara, Claudia Hauff
- Defending Against Backdoor Attacks In Natural Language Generation Xiaofei Sun et al.
- Improved Text Classification Via Contrastive Adversarial Training Lin Pan, Chung-wei Hang, Avirup Sil, Saloni Potdar
- Evaluating The Robustness Of Neural Language Models To Input Perturbations Milad Moradi, Matthias Samwald
- ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Shuohuan Wang et al.
- How Should Pre-trained Language Models Be Fine-tuned Towards Adversarial Robustness? Xinhsuai Dong, Luu Anh Tuan, Min Lin, Shuicheng Yan, Hanwang Zhang
- Urltran: Improving Phishing URL Detection Using Transformers Pranav Maneriker et al.
- The Power Of Scale For Parameter-efficient Prompt Tuning Brian Lester, Rami Al-rfou, Noah Constant
- Adversarial GLUE: A Multi-task Benchmark For Robustness Evaluation Of Language Models Boxin Wang et al.
- Dynaboard: An Evaluation-as-a-service Platform For Holistic Next-generation Benchmarking Zhiyi Ma et al.
- Revisiting Self-training For Few-shot Learning Of Language Model Yiming Chen et al.
- On Explaining Your Explanations Of BERT: An Empirical Study With Sequence Classification Zhengxuan Wu, Desmond C. Ong
- Using Adversarial Attacks To Reveal The Statistical Bias In Machine Reading Comprehension Models Jieyu Lin, Jiajie Zou, Nai Ding
- Improving Question Answering Model Robustness With Synthetic Adversarial Data Generation Max Bartolo et al.
- Evaluating Large Language Models Trained On Code Mark Chen et al.
- WARP: Word-level Adversarial Reprogramming Karen Hambardzumyan, Hrant Khachatrian, Jonathan May
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- Exploring Visual Prompts For Adapting Large-scale Models Hyojin Bahng, Ali Jahanian, Swami Sankaranarayanan, Phillip Isola
- Prompt Tuning For Generative Multimodal Pretrained Models Hao Yang et al.
- Lost At C: A User Study On The Security Implications Of Large Language Model Code Assistants Gustavo Sandoval et al.
- Ignore Previous Prompt: Attack Techniques For Language Models Fábio Perez, Ian Ribeiro
- Are Large Pre-trained Language Models Leaking Your Personal Information? Jie Huang, Hanyin Shao, Kevin Chen-chuan Chang
- Maieutic Prompting: Logically Consistent Reasoning With Recursive Explanations Jaehun Jung et al.
- Teaching Language Models To Support Answers With Verified Quotes Jacob Menick et al.
- Exploring The Universal Vulnerability Of Prompt-based Learning Paradigm Lei Xu, Yangyi Chen, Ganqu Cui, Hongcheng Gao, Zhiyuan Liu
- Language Models Are Realistic Tabular Data Generators Vadim Borisov, Kathrin Seßler, Tobias Leemann, Martin Pawelczyk, Gjergji Kasneci
- Shortcut Learning Of Large Language Models In Natural Language Understanding Mengnan Du, Fengxiang He, Na Zou, Dacheng Tao, Xia Hu
- NLX-GPT: A Model For Natural Language Explanations In Vision And Vision-language Tasks Fawaz Sammani, Tanmoy Mukherjee, Nikos Deligiannis
- A Generative Language Model For Few-shot Aspect-based Sentiment Analysis Ehsan Hosseini-asl, Wenhao Liu, Caiming Xiong
- Binding Language Models In Symbolic Languages Zhoujun Cheng et al.
- Protoclip: Prototypical Contrastive Language Image Pretraining Delong Chen et al.
- LAION-5B: An Open Large-scale Dataset For Training Next Generation Image-text Models Christoph Schuhmann et al.
- Complexity-based Prompting For Multi-step Reasoning Yao Fu, Hao Peng, Ashish Sabharwal, Peter Clark, Tushar Khot
- Multi-lingual Evaluation Of Code Generation Models Ben Athiwaratkun et al.
- St-moe: Designing Stable And Transferable Sparse Expert Models Barret Zoph et al.
- Qaner: Prompting Question Answering Models For Few-shot Named Entity Recognition Andy T. Liu et al.
- Improving Alignment Of Dialogue Agents Via Targeted Human Judgements Amelia Glaese et al.
- Commonsenseqa 2.0: Exposing The Limits Of AI Through Gamification Alon Talmor et al.
- WANLI: Worker And AI Collaboration For Natural Language Inference Dataset Creation Alisa Liu, Swabha Swayamdipta, Noah A. Smith, Yejin Choi
- Storydall-e: Adapting Pretrained Text-to-image Transformers For Story Continuation Adyasha Maharana, Darryl Hannan, Mohit Bansal
- LASP: Text-to-text Optimization For Language-aware Soft Prompting Of Vision & Language Models Adrian Bulat, Georgios Tzimiropoulos
- LIFT: Language-interfaced Fine-tuning For Non-language Machine Learning Tasks Tuan Dinh et al.
- Toxigen: A Large-scale Machine-generated Dataset For Adversarial And Implicit Hate Speech Detection Thomas Hartvigsen et al.
- Holistic Evaluation Of Language Models Percy Liang et al.
- LLM Self Defense: By Self Examination, Llms Know They Are Being Tricked Mansi Phute et al.
- Chatgpt For Vulnerability Detection, Classification, And Repair: How Far Are We? Michael Fu, Chakkrit Tantithamthavorn, Van Nguyen, Trung Le
- Can Chatgpt Replace Stackoverflow? A Study On Robustness And Reliability Of Large Language Model Code Generation Li Zhong, Zilong Wang
- A Survey Of GPT-3 Family Large Language Models Including Chatgpt And GPT-4 Katikapalli Subramanyam Kalyan
- Towards Expert-level Medical Question Answering With Large Language Models Karan Singhal et al.
- Aligning Instruction Tasks Unlocks Large Language Models As Zero-shot Relation Extractors Kai Zhang, Bernal Jiménez Gutiérrez, Yu Su
- Not What You've Signed Up For: Compromising Real-world Llm-integrated Applications With Indirect Prompt Injection Kai Greshake et al.
- A Comprehensive Capability Analysis Of GPT-3 And GPT-3.5 Series Models Junjie Ye et al.
- Backdooring Instruction-tuned Large Language Models With Virtual Prompt Injection Jun Yan et al.
- Spear Phishing With Large Language Models Julian Hazell
- Jatmo: Prompt Injection Defense By Task-specific Finetuning Julien Piet et al.
- On The Robustness Of Chatgpt: An Adversarial And Out-of-distribution Perspective Jindong Wang et al.
- Benchmarking Large Language Models In Retrieval-augmented Generation Jiawei Chen, Hongyu Lin, Xianpei Han, Le Sun
- Badgpt: Exploring Security Vulnerabilities Of Chatgpt Via Backdoor Attacks To Instructgpt Jiawen Shi, Yixin Liu, Pan Zhou, Lichao Sun
- Graphgpt: Graph Instruction Tuning For Large Language Models Jiabin Tang et al.
- LLM Lies: Hallucinations Are Not Bugs, But Features As Adversarial Examples Jia-yu Yao et al.
- Chatgpt To Replace Crowdsourcing Of Paraphrases For Intent Classification: Higher Diversity And Comparable Model Robustness Jan Cegin, Jakub Simko, Peter Brusilovsky
- Fake News In Sheep's Clothing: Robust Fake News Detection Against Llm-empowered Style Attacks Jiaying Wu, Jiafeng Guo, Bryan Hooi
- Generating Phishing Attacks Using Chatgpt Sayak Saha Roy, Krishna Vamsi Naragam, Shirin Nilizadeh
- Retrieving Multimodal Information For Augmented Generation: A Survey Ruochen Zhao et al.
- Audiogpt: Understanding And Generating Speech, Music, Sound, And Talking Head Rongjie Huang et al.
- How Secure Is Code Generated By Chatgpt? Raphaël Khoury, Anderson R. Avila, Jacob Brunelle, Baba Mamadou Camara
- Prompting The Hidden Talent Of Web-scale Speech Models For Zero-shot Task Generalization Puyuan Peng, Brian Yan, Shinji Watanabe, David Harwath
- Large Language Models Sensitivity To The Order Of Options In Multiple-choice Questions Pouya Pezeshkpour, Estevam Hruschka
- Are Aligned Neural Networks Adversarially Aligned? Nicholas Carlini et al.
- Benefits And Harms Of Large Language Models In Digital Mental Health Munmun De Choudhury, Sachin R. Pendse, Neha Kumar
- Scaling Vision Transformers To 22 Billion Parameters Mostafa Dehghani et al.
- State Of What Art? A Call For Multi-prompt LLM Evaluation Moran Mizrahi et al.
- Open Sesame! Universal Black Box Jailbreaking Of Large Language Models Raz Lapid, Ron Langberg, Moshe Sipper
- Scalable Extraction Of Training Data From (production) Language Models Milad Nasr et al.
- RLHF-V: Towards Trustworthy Mllms Via Behavior Alignment From Fine-grained Correctional Human Feedback Tianyu Yu et al.
- Large Language Model Alignment: A Survey Tianhao Shen et al.
- Red Teaming Chatgpt Via Jailbreaking: Bias, Robustness, Reliability And Toxicity Terry Yue Zhuo, Yujin Huang, Chunyang Chen, Zhenchang Xing
- Observations On Llms For Telecom Domain: Capabilities And Limitations Sumit Soman, Ranjani H G
- Pretraining Language Models With Human Preferences Tomasz Korbak et al.
- The Troubling Emergence Of Hallucination In Large Language Models -- An Extensive Definition, Quantification, And Prescriptive Remediations Vipula Rawte et al.
- Is GPT-4 A Reliable Rater? Evaluating Consistency In GPT-4 Text Ratings Veronika Hackl, Alexandra Elena Müller, Michael Granitzer, Maximilian Sailer
- Can Ai-generated Text Be Reliably Detected? Vinu Sankar Sadasivan, Aounon Kumar, Sriram Balasubramanian, Wenxiao Wang, Soheil Feizi
- GPT Detectors Are Biased Against Non-native English Writers Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, James Zou
- Ferret: Refer And Ground Anything Anywhere At Any Granularity Haoxuan You et al.
- Can Chatgpt Assess Human Personalities? A General Evaluation Framework Haocong Rao, Cyril Leung, Chunyan Miao
- Safety Assessment Of Chinese Large Language Models Hao Sun, Zhexin Zhang, Jiawen Deng, Jiale Cheng, Minlie Huang
- GPT-4 Enhanced Multimodal Grounding For Autonomous Driving: Leveraging Cross-modal Attention With Large Language Models Haicheng Liao et al.
- Preference Ranking Optimization For Human Alignment Feifan Song et al.
- Survey Of Vulnerabilities In Large Language Models Revealed By Adversarial Attacks Erfan Shayegani et al.
- Read-only Prompt Optimization For Vision-language Few-shot Learning Dongjun Lee et al.
- Exploiting Programmatic Behavior Of Llms: Dual-use Through Standard Security Attacks Daniel Kang et al.
- Can Large Language Models Be An Alternative To Human Evaluations? Cheng-han Chiang, Hung-yi Lee
- Blackvip: Black-box Visual Prompting For Robust Transfer Learning Changdae Oh et al.
- Llmseceval: A Dataset Of Natural Language Prompts For Security Evaluations Catherine Tony, Markus Mutas, Nicolás E. Díaz Ferreyra, Riccardo Scandariato
- How Close Is Chatgpt To Human Experts? Comparison Corpus, Evaluation, And Detection Biyang Guo et al.
- Refactoring Programs Using Large Language Models With Few-shot Examples Atsushi Shirafuji, Yusuke Oda, Jun Suzuki, Makoto Morishita, Yutaka Watanobe
- Universal And Transferable Adversarial Attacks On Aligned Language Models Andy Zou et al.
- How Good Are GPT Models At Machine Translation? A Comprehensive Evaluation Amr Hendy et al.
- The (ab)use Of Open Source Code To Train Large Language Models Ali Al-kaswan, Maliheh Izadi
- A Categorical Archive Of Chatgpt Failures Ali Borji
- Jailbroken: How Does LLM Safety Training Fail? Alexander Wei, Nika Haghtalab, Jacob Steinhardt
- Smoothllm: Defending Large Language Models Against Jailbreaking Attacks Alexander Robey, Eric Wong, Hamed Hassani, George J. Pappas
- Enhancing Job Recommendation Through Llm-based Generative Adversarial Networks Yingpeng Du et al.
- A Survey On Large Language Model (LLM) Security And Privacy: The Good, The Bad, And The Ugly Yifan Yao et al.
- Mindmap: Knowledge Graph Prompting Sparks Graph Of Thoughts In Large Language Models Yilin Wen, Zifeng Wang, Jimeng Sun
- Fundamental Limitations Of Alignment In Large Language Models Yotam Wolf, Noam Wies, Oshri Avnery, Yoav Levine, Amnon Shashua
- Trustworthy Llms: A Survey And Guideline For Evaluating Large Language Models' Alignment Yang Liu et al.
- Ghost In The Minecraft: Generally Capable Agents For Open-world Environments Via Large Language Models With Text-based Knowledge And Memory Xizhou Zhu et al.
- In Chatgpt We Trust? Measuring And Characterizing The Reliability Of Chatgpt Xinyue Shen, Zeyuan Chen, Michael Backes, Yang Zhang
- "Do Anything Now": Characterizing And Evaluating In-the-wild Jailbreak Prompts On Large Language Models Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, Yang Zhang
- How Robust Is GPT-3.5 To Predecessors? A Comprehensive Study On Language Understanding Tasks Xuanting Chen et al.
- Unveiling Security, Privacy, And Ethical Concerns Of Chatgpt Xiaodong Wu, Ran Duan, Jianbing Ni
- Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! Xiangyu Qi et al.
- Visual Adversarial Examples Jailbreak Aligned Large Language Models Xiangyu Qi et al.
- Representation Learning With Large Language Models For Recommendation Xubin Ren et al.
- Is Chatgpt A Good Translator? Yes With GPT-4 As The Engine Wenxiang Jiao et al.
- SPHINX: The Joint Mixing Of Weights, Tasks, And Visual Embeddings For Multi-modal Large Language Models Ziyi Lin et al.
- On Evaluating Adversarial Robustness Of Large Vision-language Models Yunqing Zhao et al.
- Mapping The Ethics Of Generative AI: A Comprehensive Scoping Review Thilo Hagendorff
- Large Language Models In Cybersecurity: State-of-the-art Farzad Nourmohammadzadeh Motlagh et al.
- Generative AI In EU Law: Liability, Privacy, Intellectual Property, And Cybersecurity Claudio Novelli, Federico Casolari, Philipp Hacker, Giorgio Spedicato, Luciano Floridi
- Trustllm: Trustworthiness In Large Language Models Yue Huang et al.
- Large Language Models In Mental Health Care: A Scoping Review Yining Hua et al.
- How Johnny Can Persuade Llms To Jailbreak Them: Rethinking Persuasion To Challenge AI Safety By Humanizing Llms Yi Zeng et al.
🏷 SLT
- Improving The Transformer Translation Model With Document-level Context Jiacheng Zhang et al.
- Pretrained Language Models For Document-level Neural Machine Translation Liangyou Li, Xin Jiang, Qun Liu
- Transformers Without Tears: Improving The Normalization Of Self-attention Toan Q. Nguyen, Julian Salazar
- On The Use Of BERT For Neural Machine Translation Stéphane Clinchant, Kweon Woo Jung, Vassilina Nikoulina
- Distilling Knowledge Learned In BERT For Text Generation Yen-chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu
- A Simple But Tough-to-beat Data Augmentation Approach For Natural Language Understanding And Generation Dinghan Shen, Mingzhi Zheng, Yelong Shen, Yanru Qu, Weizhu Chen
- Incorporating BERT Into Parallel Sequence Decoding With Adapters Junliang Guo et al.
- BERT, Mbert, Or Bibert? A Study On Contextualized Embeddings For Neural Machine Translation Haoran Xu, Benjamin Van Durme, Kenton Murray
🏷 Survey Paper
- Generative Deep Neural Networks For Dialogue: A Short Review Iulian Vlad Serban, Ryan Lowe, Laurent Charlin, Joelle Pineau
- DP-GAN: Diversity-promoting Generative Adversarial Network For Generating Informative And Diversified Text Jingjing Xu, Xuancheng Ren, Junyang Lin, Xu Sun
- Why Are Sequence-to-sequence Models So Dull? Understanding The Low-diversity Problem Of Chatbots Shaojie Jiang, Maarten De Rijke
- Generalization In Generation: A Closer Look At Exposure Bias Florian Schmidt
- Review Conversational Reading Comprehension Hu Xu, Bing Liu, Lei Shu, Philip S. Yu
- Deep Learning Based Chatbot Models Richard Csaky
- A Survey Of Natural Language Generation Techniques With A Focus On Dialogue Systems - Past, Present And Future Directions Sashank Santhanam, Samira Shaikh
- Meaningful Answer Generation Of E-commerce Question-answering Shen Gao, Xiuying Chen, Zhaochun Ren, Dongyan Zhao, Rui Yan
- Compressing Large-scale Transformer-based Models: A Case Study On BERT Prakhar Ganesh et al.
- Machine Reading Comprehension: The Role Of Contextualized Language Models And Beyond Zhuosheng Zhang, Hai Zhao, Rui Wang
- A Survey Of Knowledge-enhanced Text Generation Wenhao Yu et al.
- Adversarial Training For Large Neural Language Models Xiaodong Liu et al.
- Logical Natural Language Generation From Open-domain Tables Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang
- An Exploratory Study On Long Dialogue Summarization: What Works And What's Next Yusen Zhang et al.
- Personalized Transformer For Explainable Recommendation Lei Li, Yongfeng Zhang, Li Chen
- Societal Biases In Language Generation: Progress And Challenges Emily Sheng, Kai-wei Chang, Premkumar Natarajan, Nanyun Peng
- Efficient Large-scale Language Model Training On GPU Clusters Using Megatron-lm Deepak Narayanan et al.
- Pre-train, Prompt, And Predict: A Systematic Survey Of Prompting Methods In Natural Language Processing Pengfei Liu et al.
- A Short Survey Of Pre-trained Language Models For Conversational AI-A Newage In NLP Munazza Zaib, Quan Z. Sheng, Wei Emma Zhang
- Recent Advances In Natural Language Processing Via Large Pre-trained Language Models: A Survey Bonan Min et al.
- Scale Efficiently: Insights From Pre-training And Fine-tuning Transformers Yi Tay et al.
- Characterchat: Supporting The Creation Of Fictional Characters Through Conversation And Progressive Manifestation With A Chatbot Oliver Schmitt, Daniel Buschek
- Quiz-style Question Generation For News Stories Adam D. Lelkes, Vinh Q. Tran, Cong Yu
- Adapting Language Models For Zero-shot Learning By Meta-tuning On Dataset And Prompt Collections Ruiqi Zhong, Kristy Lee, Zheng Zhang, Dan Klein
- Pretrained Language Models For Text Generation: A Survey Junyi Li, Tianyi Tang, Wayne Xin Zhao, Ji-rong Wen
- AMMUS : A Survey Of Transformer-based Pretrained Models In Natural Language Processing Katikapalli Subramanyam Kalyan, Ajit Rajasekharan, Sivanesan Sangeetha
- Recommendation As Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5) Shijie Geng, Shuchang Liu, Zuohui Fu, Yingqiang Ge, Yongfeng Zhang
- Vision-and-language Pretrained Models: A Survey Siqu Long, Feiqi Cao, Soyeon Caren Han, Haiqin Yang
- A Survey On Retrieval-augmented Text Generation Huayang Li, Yixuan Su, Deng Cai, Yan Wang, Lemao Liu
- Reasoning With Language Model Prompting: A Survey Shuofei Qiao et al.
- A Survey Of Controllable Text Generation Using Transformer-based Pre-trained Language Models Hanqing Zhang, Haolin Song, Shaoyu Li, Ming Zhou, Dawei Song
- Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey Sachin Kumar, Vidhisha Balachandran, Lucille Njoo, Antonios Anastasopoulos, Yulia Tsvetkov
- Coditt5: Pretraining For Source Code And Natural Language Editing Jiyang Zhang, Sheena Panthaplackel, Pengyu Nie, Junyi Jessy Li, Milos Gligoric
- Towards Reasoning In Large Language Models: A Survey Jie Huang, Kevin Chen-chuan Chang
- Language Models As Agent Models Jacob Andreas
- Shortcut Learning Of Large Language Models In Natural Language Understanding Mengnan Du, Fengxiang He, Na Zou, Dacheng Tao, Xia Hu
- The Debate Over Understanding In Ai's Large Language Models Melanie Mitchell, David C. Krakauer
- Vision-language Intelligence: Tasks, Representation Learning, And Large Models Feng Li et al.
- Large Language Models Meet Nl2code: A Survey Daoguang Zan et al.
- A Survey On Model Compression And Acceleration For Pretrained Language Models Canwen Xu, Julian Mcauley
- Survey Of Hallucination In Natural Language Generation Ziwei Ji et al.
- Thinking About GPT-3 In-context Learning For Biomedical IE? Think Again Bernal Jiménez Gutiérrez et al.
- Retrieval Augmentation Of Large Language Models For Lay Language Generation Yue Guo, Wei Qiu, Gondy Leroy, Sheng Wang, Trevor Cohen
- A Systematic Review And Replicability Study Of Bert4rec For Sequential Recommendation Aleksandr Petrov, Craig Macdonald
- Delta Tuning: A Comprehensive Study Of Parameter Efficient Methods For Pre-trained Language Models Ning Ding et al.
- From Image To Language: A Critical Analysis Of Visual Question Answering (VQA) Approaches, Challenges, And Opportunities Md Farhan Ishmam, Md Sakib Hossain Shovon, M. F. Mridha, Nilanjan Dey
- Gptaraeval: A Comprehensive Evaluation Of Chatgpt On Arabic NLP Md Tawkat Islam Khondaker, Abdul Waheed, El Moatez Billah Nagoudi, Muhammad Abdul-mageed
- Co-writing With Opinionated Language Models Affects Users' Views Maurice Jakesch, Advait Bhat, Daniel Buschek, Lior Zalmanson, Mor Naaman
- Natural Language Generation And Understanding Of Big Code For Ai-assisted Programming: A Review Man Fai Wong, Shangxin Guo, Ching Nam Hang, Siu Wai Ho, Chee Wei Tan
- Chatgpt For Vulnerability Detection, Classification, And Repair: How Far Are We? Michael Fu, Chakkrit Tantithamthavorn, Van Nguyen, Trung Le
- Practical And Ethical Challenges Of Large Language Models In Education: A Systematic Scoping Review Lixiang Yan et al.
- A Bibliometric Review Of Large Language Models Research From 2017 To 2023 Lizhou Fan et al.
- Parameter-efficient Fine-tuning Methods For Pretrained Language Models: A Critical Review And Assessment Lingling Xu, Haoran Xie, Si-zhao Joe Qin, Xiaohui Tao, Fu Lee Wang
- Do Llms Exhibit Human-like Response Biases? A Case Study In Survey Design Lindia Tjuatja, Valerie Chen, Sherry Tongshuang Wu, Ameet Talwalkar, Graham Neubig
- Automatically Correcting Large Language Models: Surveying The Landscape Of Diverse Self-correction Strategies Liangming Pan et al.
- A Survey On Large Language Models For Recommendation Likang Wu et al.
- A Survey On Hallucination In Large Language Models: Principles, Taxonomy, Challenges, And Open Questions Lei Huang et al.
- A Survey Of GPT-3 Family Large Language Models Including Chatgpt And GPT-4 Katikapalli Subramanyam Kalyan
- The Rise And Potential Of Large Language Model Based Agents: A Survey Zhiheng Xi et al.
- Ai-augmented Surveys: Leveraging Large Language Models And Surveys For Opinion Prediction Junsol Kim, Byungkyu Lee
- Llama-reviewer: Advancing Code Review Automation With Large Language Models Through Parameter-efficient Fine-tuning Junyi Lu, Lei Yu, Xiaojia Li, Li Yang, Chun Zuo
- LEXTREME: A Multi-lingual And Multi-task Benchmark For The Legal Domain Joel Niklaus et al.
- On The Robustness Of Chatgpt: An Adversarial And Out-of-distribution Perspective Jindong Wang et al.
- When Large Language Models Meet Personalization: Perspectives Of Challenges And Opportunities Jin Chen et al.
- A Systematic Survey Of Prompt Engineering On Vision-language Foundation Models Jindong Gu et al.
- How Can Recommender Systems Benefit From Large Language Models: A Survey Jianghao Lin et al.
- Large Language Models In Medicine: The Potentials And Pitfalls Jesutofunmi A. Omiye, Haiwen Gui, Shawheen J. Rezaei, James Zou, Roxana Daneshjou
- The Robots Are Here: Navigating The Generative AI Revolution In Computing Education James Prather et al.
- "It's Not Like Jarvis, But It's Pretty Close!" -- Examining Chatgpt's Usage Among Undergraduate Students In Computer Science Ishika Joshi, Ritvik Budhiraja, Harshal D Akolekar, Jagat Sesh Challa, Dhruv Kumar
- A Comprehensive Overview Of Large Language Models Humza Naveed et al.
- Unifying Large Language Models And Knowledge Graphs: A Roadmap Shirui Pan et al.
- Llm-in-the-loop: Leveraging Large Language Model For Thematic Analysis Shih-chieh Dai, Aiping Xiong, Lun-wei Ku
- Instruction Tuning For Large Language Models: A Survey Shengyu Zhang et al.
- Decoding Chatgpt: A Taxonomy Of Existing Research, Current Challenges, And Possible Future Directions Shahab Saquib Sohail et al.
- Medalign: A Clinician-generated Dataset For Instruction Following With Electronic Medical Records Scott L. Fleming et al.
- Are Emergent Abilities Of Large Language Models A Mirage? Rylan Schaeffer, Brando Miranda, Sanmi Koyejo
- The Science Of Detecting Llm-generated Texts Ruixiang Tang, Yu-neng Chuang, Xia Hu
- Gpteval: A Survey On Assessments Of Chatgpt And GPT-4 Rui Mao, Guanyi Chen, Xulang Zhang, Frank Guerin, Erik Cambria
- Retrieving Multimodal Information For Augmented Generation: A Survey Ruochen Zhao et al.
- Beyond Memorization: Violating Privacy Via Inference With Large Language Models Robin Staab, Mark Vero, Mislav Balunović, Martin Vechev
- Chatgpt Versus Traditional Question Answering For Knowledge Graphs: Current Status And Future Directions Towards Knowledge Graph Chatbots Reham Omar, Omij Mangukiya, Panos Kalnis, Essam Mansour
- Evaluation Of Chatgpt-generated Medical Responses: A Systematic Review And Meta-analysis Qiuhong Wei et al.
- Can Large Language Models Replace Humans In The Systematic Review Process? Evaluating Gpt-4's Efficacy In Screening And Extracting Data From Peer-reviewed And Grey Literature In Multiple Languages Qusai Khraisha, Sophie Put, Johanna Kappenberg, Azza Warraitch, Kristin Hadfield
- Students' Perceptions And Preferences Of Generative Artificial Intelligence Feedback For Programming Zhengdong Zhang et al.
- Pre-train, Prompt And Recommendation: A Comprehensive Survey Of Language Modelling Paradigm Adaptations In Recommender Systems Peng Liu, Lemei Zhang, Jon Atle Gulla
- Bridging The Gap: A Survey On Integrating (human) Feedback For Natural Language Generation Patrick Fernandes et al.
- Med-flamingo: A Multimodal Medical Few-shot Learner Michael Moor et al.
- A Large Language Model Approach To Educational Survey Feedback Analysis Michael J. Parker, Caitlin Anderson, Claire Stone, Yearim Oh
- Large Language Model Alignment: A Survey Tianhao Shen et al.
- Cognitive Architectures For Language Agents Theodore R. Sumers, Shunyu Yao, Karthik Narasimhan, Thomas L. Griffiths
- A Survey On Multimodal Large Language Models Shukang Yin et al.
- Opportunities And Challenges For Chatgpt And Large Language Models In Biomedicine And Health Shubo Tian et al.
- Language Model Behavior: A Comprehensive Survey Tyler A. Chang, Benjamin K. Bergen
- Creativity Support In The Age Of Large Language Models: An Empirical Study Involving Emerging Writers Tuhin Chakrabarty, Vishakh Padmakumar, Faeze Brahman, Smaranda Muresan
- A Survey Of Large Language Models Wayne Xin Zhao et al.
- Personalisation Within Bounds: A Risk Taxonomy And Policy Framework For The Alignment Of Large Language Models With Personalised Feedback Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale
- Augmented Language Models: A Survey Grégoire Mialon et al.
- Explainability For Large Language Models: A Survey Haiyan Zhao et al.
- Should Chatgpt Be Biased? Challenges And Risks Of Bias In Large Language Models Emilio Ferrara
- Survey Of Vulnerabilities In Large Language Models Revealed By Adversarial Attacks Erfan Shayegani et al.
- Simulating H.P. Lovecraft Horror Literature With The Chatgpt Large Language Model Eduardo C. Garrido-merchán, José Luis Arroyo-barrigüete, Roberto Gozalo-brizuela
- Large Language Models For Generative Information Extraction: A Survey Derong Xu et al.
- Multimodal Foundation Models: From Specialists To General-purpose Assistants Chunyuan Li et al.
- One Small Step For Generative AI, One Giant Leap For AGI: A Complete Survey On Chatgpt In AIGC Era Chaoning Zhang et al.
- Large Language Models On Graphs: A Comprehensive Survey Bowen Jin et al.
- A Short Survey Of Viewing Large Language Models In Legal Aspect Zhongxiang Sun
- Fundamentals Of Generative Large Language Models And Perspectives In Cyber-defense Andrei Kucharavy et al.
- Generative AI: Implications And Applications For Education Anastasia Olnancy Olga et al.
- Opening Up Chatgpt: Tracking Openness, Transparency, And Accountability In Instruction-tuned Text Generators Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse
- Chatgpt: More Than A Weapon Of Mass Deception, Ethical Challenges And Responses From The Human-centered Artificial Intelligence (HCAI) Perspective Alejo Jose G. Sison, Marco Tulio Daza, Roberto Gozalo-brizuela, Eduardo C. Garrido-merchán
- On Learning To Summarize With Large Language Models As References Yixin Liu et al.
- Recommender Systems In The Era Of Large Language Models (llms) Zihuai Zhao et al.
- A Survey On Large Language Model (LLM) Security And Privacy: The Good, The Bad, And The Ugly Yifan Yao et al.
- A Comprehensive Survey Of Ai-generated Content (AIGC): A History Of Generative AI From GAN To Chatgpt Yihan Cao et al.
- Summary Of Chatgpt-related Research And Perspective Towards The Future Of Large Language Models Yiheng Liu et al.
- Trustworthy Llms: A Survey And Guideline For Evaluating Large Language Models' Alignment Yang Liu et al.
- A Survey On Model Compression For Large Language Models Xunyu Zhu, Jian Li, Yong Liu, Can Ma, Weiping Wang
- Fine-tuning Llama For Multi-stage Text Retrieval Xueguang Ma, Liang Wang, Nan Yang, Furu Wei, Jimmy Lin
- Vision Language Models In Autonomous Driving: A Survey And Outlook Xingcheng Zhou et al.
- Instructblip: Towards General-purpose Vision-language Models With Instruction Tuning Wenliang Dai et al.
- Aligning Large Language Models With Human: A Survey Yufei Wang et al.
- Large Language Models In Healthcare And Medical Domain: A Review Zabir Al Nazi, Wei Peng
- Monitoring Ai-modified Content At Scale: A Case Study On The Impact Of Chatgpt On AI Conference Peer Reviews Weixin Liang et al.
- Continual Learning For Large Language Models: A Survey Tongtong Wu et al.
- Mapping The Ethics Of Generative AI: A Comprehensive Scoping Review Thilo Hagendorff
- A Comprehensive Survey Of Hallucination Mitigation Techniques In Large Language Models S. M Towhidul Islam Tonmoy et al.
- Large Language Models And Games: A Survey And Roadmap Roberto Gallotta et al.
- Beyond Code Generation: An Observational Study Of Chatgpt Usage In Software Engineering Practice Ranim Khojah, Mazen Mohamad, Philipp Leitner, Francisco Gomes De Oliveira Neto
- A Systematic Survey Of Prompt Engineering In Large Language Models: Techniques And Applications Pranab Sahoo et al.
- From Text To Transformation: A Comprehensive Review Of Large Language Models' Versatility Pravneet Kaur et al.
- AI Hallucinations: A Misnomer Worth Clarifying Negar Maleki, Balaji Padmanabhan, Kaushik Dutta
- A Survey Of Resource-efficient LLM And Multimodal Foundation Models Mengwei Xu et al.
- History Of Generative Artificial Intelligence (AI) Chatbots: Past, Present, And Future Development Md. Al-amin et al.
- A Review Of Large Language Models And Autonomous Agents In Chemistry Mayk Caldas Ramos, Christopher J. Collison, Andrew D. White
- Codeaid: Evaluating A Classroom Deployment Of An Llm-based Programming Assistant That Balances Student And Educator Needs Majeed Kazemitabaar et al.
- Revolutionizing Finance With Llms: An Overview Of Applications And Insights Huaqin Zhao et al.
- (A)I Am Not A Lawyer, But...: Engaging Legal Experts Towards Responsible LLM Policies For Legal Advice Inyoung Cheong, King Xia, K. J. Kevin Feng, Quan Ze Chen, Amy X. Zhang
- Large Language Models In Cybersecurity: State-of-the-art Farzad Nourmohammadzadeh Motlagh et al.
- Ai-tutoring In Software Engineering Education Eduard Frankford, Clemens Sauerwein, Patrick Bassner, Stephan Krusche, Ruth Breu
- The Revolution Of Multimodal Large Language Models: A Survey Davide Caffagni et al.
- Generative AI In Education: A Study Of Educators' Awareness, Sentiments, And Influencing Factors Aashish Ghimire, James Prather, John Edwards
- Large Language Models For Data Annotation And Synthesis: A Survey Zhen Tan et al.
- A Survey On Lora Of Large Language Models Yuren Mao et al.
- Survey On Large Language Model-enhanced Reinforcement Learning: Concept, Taxonomy, And Methods Yuji Cao et al.
- Sora: A Review On Background, Technology, Limitations, And Opportunities Of Large Vision Models Yixin Liu et al.
- Large Language Models In Mental Health Care: A Scoping Review Yining Hua et al.
- A Review Of Modern Recommender Systems Using Generative Models (gen-recsys) Yashar Deldjoo et al.
- Datasets For Large Language Models: A Comprehensive Survey Yang Liu, Jiahuan Cao, Chongyu Liu, Kai Ding, Lianwen Jin
- A Survey On RAG Meeting Llms: Towards Retrieval-augmented Large Language Models Wenqi Fan et al.
🏷 TACL
🏷 Time Series
- Fairseq: A Fast, Extensible Toolkit For Sequence Modeling Myle Ott et al.
- Variational Transformers For Diverse Response Generation Zhaojiang Lin, Genta Indra Winata, Peng Xu, Zihan Liu, Pascale Fung
- Swinbert: End-to-end Transformers With Sparse Attention For Video Captioning Kevin Lin et al.
- Luna: Linear Unified Nested Attention Xuezhe Ma et al.
- When Attention Meets Fast Recurrence: Training Language Models With Reduced Compute Tao Lei
- Scaling Autoregressive Models For Content-rich Text-to-image Generation Jiahui Yu et al.
- Vindlu: A Recipe For Effective Video-and-language Pretraining Feng Cheng et al.
- Personalized Prompt For Sequential Recommendation Yiqing Wu et al.
- Longnet: Scaling Transformers To 1,000,000,000 Tokens Jiayu Ding et al.
- Self-chained Image-language Model For Video Localization And Question Answering Shoubin Yu, Jaemin Cho, Prateek Yadav, Mohit Bansal
- Large Language Models Are Zero-shot Time Series Forecasters Nate Gruver, Marc Finzi, Shikai Qiu, Andrew Gordon Wilson
- Time-llm: Time Series Forecasting By Reprogramming Large Language Models Ming Jin et al.
- Emergent And Predictable Memorization In Large Language Models Stella Biderman et al.
- Can Chatgpt Forecast Stock Price Movements? Return Predictability And Large Language Models Alejandro Lopez-lira, Yuehua Tang
- Mamba: Linear-time Sequence Modeling With Selective State Spaces Albert Gu, Tri Dao
- Language Models Represent Space And Time Wes Gurnee, Max Tegmark
- Retentive Network: A Successor To Transformer For Large Language Models Yutao Sun et al.
- Contextual AI Journaling: Integrating LLM And Time Series Behavioral Sensing Technology To Promote Self-reflection And Well-being Using The Mindscape App Subigya Nepal et al.
- Revolutionizing Finance With Llms: An Overview Of Applications And Insights Huaqin Zhao et al.
- Unist: A Prompt-empowered Universal Model For Urban Spatio-temporal Prediction Yuan Yuan, Jingtao Ding, Jie Feng, Depeng Jin, Yong Li
🏷 Tokenization
- Bridging The Gap For Tokenizer-free Language Models Dokook Choe, Rami Al-rfou, Mandy Guo, Heeyoung Lee, Noah Constant
- Byte Pair Encoding Is Suboptimal For Language Model Pretraining Kaj Bostrom, Greg Durrett
- Automated Source Code Generation And Auto-completion Using Deep Learning: Comparing And Discussing Current Language-model-related Approaches Juan Cruz-benito, Sanjay Vishwakarma, Francisco Martin-fernandez, Ismael Faro
- Wangchanberta: Pretraining Transformer-based Thai Language Models Lalita Lowphansirikul, Charin Polpanumas, Nawat Jantrakulchai, Sarana Nutanong
- Trankit: A Light-weight Transformer-based Toolkit For Multilingual Natural Language Processing Minh Van Nguyen, Viet Dac Lai, Amir Pouran Ben Veyseh, Thien Huu Nguyen
- What Changes Can Large-scale Language Models Bring? Intensive Study On Hyperclova: Billions-scale Korean Generative Pretrained Transformers Boseop Kim et al.
- CANINE: Pre-training An Efficient Tokenization-free Encoder For Language Representation Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting
- Leveraging Large Language Models For Multiple Choice Question Answering Joshua Robinson, Christopher Michael Rytting, David Wingate
- Audiolm: A Language Modeling Approach To Audio Generation Zalán Borsos et al.
- Video-llava: Learning United Visual Representation By Alignment Before Projection Bin Lin et al.
- Language Model Tokenizers Introduce Unfairness Between Languages Aleksandar Petrov, Emanuele La Malfa, Philip H. S. Torr, Adel Bibi
🏷 Tools
- Sequence-to-sequence Learning As Beam-search Optimization Sam Wiseman, Alexander M. Rush
- Topic Aware Neural Response Generation Chen Xing et al.
- A User Simulator For Task-completion Dialogues Xiujun Li et al.
- Neural Personalized Response Generation As Domain Adaptation Weinan Zhang, Ting Liu, Yifa Wang, Qingfu Zhu
- Long Text Generation Via Adversarial Training With Leaked Information Jiaxian Guo et al.
- Latent Intention Dialogue Models Tsung-hsien Wen, Yishu Miao, Phil Blunsom, Steve Young
- End-to-end Optimization Of Goal-driven And Visually Grounded Dialogue Systems Florian Strub et al.
- A Unified Query-based Generative Model For Question Generation And Question Answering Linfeng Song, Zhiguo Wang, Wael Hamza
- Parlai: A Dialog Research Software Platform Alexander H. Miller et al.
- Dialogue Generation: From Imitation Learning To Inverse Reinforcement Learning Ziming Li, Julia Kiseleva, Maarten De Rijke
- Extending Neural Generative Conversational Model Using External Knowledge Sources Prasanna Parthasarathi, Joelle Pineau
- A Retrieve-and-edit Framework For Predicting Structured Outputs Tatsunori B. Hashimoto, Kelvin Guu, Yonatan Oren, Percy Liang
- Babyai: A Platform To Study The Sample Efficiency Of Grounded Language Learning Maxime Chevalier-boisvert et al.
- Advancing The State Of The Art In Open Domain Dialog Systems Through The Alexa Prize Chandra Khatri et al.
- Skeleton-to-response: Dialogue Generation Guided By Retrieval Memory Deng Cai et al.
- Language Gans Falling Short Massimo Caccia et al.
- Toward Diverse Text Generation With Inverse Reinforcement Learning Zhan Shi, Xinchi Chen, Xipeng Qiu, Xuanjing Huang
- Retrieval-enhanced Adversarial Training For Neural Response Generation Qingfu Zhu, Lei Cui, Weinan Zhang, Furu Wei, Ting Liu
- Controllable Neural Story Plot Generation Via Reward Shaping Pradyumna Tambwekar et al.
- Training Tips For The Transformer Model Martin Popel, Ondřej Bojar
- Tensor2tensor For Neural Machine Translation Ashish Vaswani et al.
- Generating Informative And Diverse Conversational Responses Via Adversarial Information Maximization Yizhe Zhang et al.
- "Bilingual Expert" Can Find Translation Errors Kai Fan et al.
- Conversational AI: The Science Behind The Alexa Prize Ashwin Ram et al.
- MKD: A Multi-task Knowledge Distillation Approach For Pretrained Language Models Linqing Liu, Huan Wang, Jimmy Lin, Richard Socher, Caiming Xiong
- BART: Denoising Sequence-to-sequence Pre-training For Natural Language Generation, Translation, And Comprehension Mike Lewis et al.
- Probing Natural Language Inference Models Through Semantic Fragments Kyle Richardson, Hai Hu, Lawrence S. Moss, Ashish Sabharwal
- Entity-consistent End-to-end Task-oriented Dialogue System With KB Retriever Libo Qin et al.
- Acquiring Knowledge From Pre-trained Model To Neural Machine Translation Rongxiang Weng, Heng Yu, Shujian Huang, Shanbo Cheng, Weihua Luo
- Masked Language Model Scoring Julian Salazar, Davis Liang, Toan Q. Nguyen, Katrin Kirchhoff
- Nemo: A Toolkit For Building AI Applications Using Neural Modules Oleksii Kuchaiev et al.
- Convert: Efficient And Accurate Conversational Representations From Transformers Matthew Henderson et al.
- Non-monotonic Sequential Text Generation Sean Welleck, Kianté Brantley, Hal Daumé III, Kyunghyun Cho
- PLATO: Pre-trained Dialogue Generation Model With Discrete Latent Variable Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang
- Sticking To The Facts: Confident Decoding For Faithful Data-to-text Generation Ran Tian, Shashi Narayan, Thibault Sellam, Ankur P. Parikh
- Approximating Interactive Human Evaluation With Self-play For Open-domain Dialog Systems Asma Ghandeharioun et al.
- LXMERT: Learning Cross-modality Encoder Representations From Transformers Hao Tan, Mohit Bansal
- UER: An Open-source Toolkit For Pre-training Models Zhe Zhao et al.
- Generalization In Generation: A Closer Look At Exposure Bias Florian Schmidt
- Multi-step Retriever-reader Interaction For Scalable Open-domain Question Answering Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Andrew Mccallum
- Pythia: Ai-assisted Code Completion System Alexey Svyatkovskiy, Ying Zhao, Shengyu Fu, Neel Sundaresan
- Empdg: Multiresolution Interactive Empathetic Dialogue Generation Qintong Li et al.
- Reweighted Proximal Pruning For Large-scale Language Representation Fu-ming Guo, Sijia Liu, Finlay S. Mungall, Xue Lin, Yanzhi Wang
- Tinybert: Distilling BERT For Natural Language Understanding Xiaoqi Jiao et al.
- Modeling Graph Structure In Transformer For Better Amr-to-text Generation Jie Zhu et al.
- A Generalized Framework Of Sequence Generation With Application To Undirected Sequence Models Elman Mansimov, Alex Wang, Sean Welleck, Kyunghyun Cho
- Exbert: A Visual Analysis Tool To Explore Learned Representations In Transformers Models Benjamin Hoover, Hendrik Strobelt, Sebastian Gehrmann
- Story Ending Prediction By Transferable BERT Zhongyang Li, Xiao Ding, Ting Liu
- Learning From Explanations With Neural Execution Tree Ziqi Wang et al.
- MASS: Masked Sequence To Sequence Pre-training For Language Generation Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-yan Liu
- Reinforced Dynamic Reasoning For Conversational Question Generation Boyuan Pan, Hao Li, Ziyu Yao, Deng Cai, Huan Sun
- 12-in-1: Multi-task Vision And Language Representation Learning Jiasen Lu, Vedanuj Goswami, Marcus Rohrbach, Devi Parikh, Stefan Lee
- Controlling The Output Length Of Neural Machine Translation Surafel Melaku Lakew, Mattia Di Gangi, Marcello Federico
- Hello, It's GPT-2 -- How Can I Help You? Towards The Use Of Pretrained Language Models For Task-oriented Dialogue Systems Paweł Budzianowski, Ivan Vulić
- MLQA: Evaluating Cross-lingual Extractive Question Answering Patrick Lewis, Barlas Oğuz, Ruty Rinott, Sebastian Riedel, Holger Schwenk
- Adapting And Evaluating A Deep Learning Language Model For Clinical Why-question Answering Andrew Wen, Mohamed Y. Elwazir, Sungrim Moon, Jungwei Fan
- Generating Empathetic Responses By Looking Ahead The User's Sentiment Jamin Shin, Peng Xu, Andrea Madotto, Pascale Fung
- GLTR: Statistical Detection And Visualization Of Generated Text Sebastian Gehrmann, Hendrik Strobelt, Alexander M. Rush
- Juice: A Large Scale Distantly Supervised Dataset For Open Domain Context-based Code Generation Rajas Agashe, Srinivasan Iyer, Luke Zettlemoyer
- Using Natural Language For Reward Shaping In Reinforcement Learning Prasoon Goyal, Scott Niekum, Raymond J. Mooney
- Explain Yourself! Leveraging Language Models For Commonsense Reasoning Nazneen Fatema Rajani, Bryan Mccann, Caiming Xiong, Richard Socher
- Megatron-lm: Training Multi-billion Parameter Language Models Using Model Parallelism Mohammad Shoeybi et al.
- Text Summarization With Pretrained Encoders Yang Liu, Mirella Lapata
- Towards Making The Most Of BERT In Neural Machine Translation Jiacheng Yang et al.
- A Modular Task-oriented Dialogue System Using A Neural Mixture-of-experts Jiahuan Pei, Pengjie Ren, Maarten De Rijke
- Exploring The Limits Of Transfer Learning With A Unified Text-to-text Transformer Colin Raffel et al.
- Learning To Answer By Learning To Ask: Getting The Best Of GPT-2 And BERT Worlds Tassilo Klein, Moin Nabi
- Pre-training Text-to-text Transformers For Concept-centric Common Sense Wangchunshu Zhou et al.
- Modelling Hierarchical Structure Between Dialogue Policy And Natural Language Generator With Option Framework For Task-oriented Dialogue System Jianhong Wang, Yuan Zhang, Tae-kyun Kim, Yunjie Gu
- Russiansuperglue: A Russian Language Understanding Evaluation Benchmark Tatiana Shavrina et al.
- GO FIGURE: A Meta Evaluation Of Factuality In Summarization Saadia Gabriel, Asli Celikyilmaz, Rahul Jha, Yejin Choi, Jianfeng Gao
- Knowledge-driven Data Construction For Zero-shot Evaluation In Commonsense Question Answering Kaixin Ma et al.
- WT5?! Training Text-to-text Models To Explain Their Predictions Sharan Narang et al.
- Dense Passage Retrieval For Open-domain Question Answering Vladimir Karpukhin et al.
- Layoutlmv2: Multi-modal Pre-training For Visually-rich Document Understanding Yang Xu et al.
- Optimus: Organizing Sentences Via Pre-trained Modeling Of A Latent Space Chunyuan Li et al.
- Meaningful Answer Generation Of E-commerce Question-answering Shen Gao, Xiuying Chen, Zhaochun Ren, Dongyan Zhao, Rui Yan
- Fine-tuning Pre-trained Language Model With Weak Supervision: A Contrastive-regularized Self-training Approach Yue Yu et al.
- Enabling Language Models To Fill In The Blanks Chris Donahue, Mina Lee, Percy Liang
- Coda: Contrast-enhanced And Diversity-promoting Data Augmentation For Natural Language Understanding Yanru Qu et al.
- VD-BERT: A Unified Vision And Dialog Transformer With BERT Yue Wang et al.
- Knowledge Distillation For Improved Accuracy In Spoken Question Answering Chenyu You, Nuo Chen, Yuexian Zou
- Towards Learning A Generic Agent For Vision-and-language Navigation Via Pre-training Weituo Hao, Chunyuan Li, Xiujun Li, Lawrence Carin, Jianfeng Gao
- ERNIE-GEN: An Enhanced Multi-flow Pre-training And Fine-tuning Framework For Natural Language Generation Dongling Xiao et al.
- Gshard: Scaling Giant Models With Conditional Computation And Automatic Sharding Dmitry Lepikhin et al.
- M3P: Learning Universal Representations Via Multitask Multilingual Multimodal Pre-training Minheng Ni et al.
- Rapidly Bootstrapping A Question Answering Dataset For COVID-19 Raphael Tang et al.
- PLATO-2: Towards Building An Open-domain Chatbot Via Curriculum Learning Siqi Bao et al.
- Efficient Transformer-based Large Scale Language Representations Using Hardware-friendly Block Structured Pruning Bingbing Li et al.
- TIME: Text And Image Mutual-translation Adversarial Networks Bingchen Liu, Kunpeng Song, Yizhe Zhu, Gerard De Melo, Ahmed Elgammal
- Long Range Arena: A Benchmark For Efficient Transformers Yi Tay et al.
- Unqovering Stereotyping Biases Via Underspecified Questions Tao Li, Tushar Khot, Daniel Khashabi, Ashish Sabharwal, Vivek Srikumar
- Contrastive Distillation On Intermediate Representations For Language Model Compression Siqi Sun et al.
- Neural Generation Meets Real People: Towards Emotionally Engaging Mixed-initiative Conversations Ashwin Paranjape et al.
- Bert-hlstms: BERT And Hierarchical Lstms For Visual Storytelling Jing Su, Qingyun Dai, Frank Guerin, Mian Zhou
- Recipes For Safety In Open-domain Chatbots Jing Xu et al.
- Chatbot Interaction With Artificial Intelligence: Human Data Augmentation With T5 And Language Transformer Ensemble For Text Classification Jordan J. Bird, Anikó Ekárt, Diego R. Faria
- MEGATRON-CNTRL: Controllable Story Generation With External Knowledge Using Large-scale Language Models Peng Xu et al.
- ABNIRML: Analyzing The Behavior Of Neural IR Models Sean Macavaney, Sergey Feldman, Nazli Goharian, Doug Downey, Arman Cohan
- ECONET: Effective Continual Pretraining Of Language Models For Event Temporal Reasoning Rujun Han, Xiang Ren, Nanyun Peng
- Incorporating BERT Into Parallel Sequence Decoding With Adapters Junliang Guo et al.
- Contrastive Learning With Adversarial Perturbations For Conditional Text Generation Seanie Lee, Dong Bok Lee, Sung Ju Hwang
- Edgebert: Sentence-level Energy Optimizations For Latency-aware Multi-task NLP Inference Thierry Tambe et al.
- Coregen: Contextualized Code Representation Learning For Commit Message Generation Lun Yiu Nie et al.
- TRANS-BLSTM: Transformer With Bidirectional LSTM For Language Understanding Zhiheng Huang, Peng Xu, Davis Liang, Ajay Mishra, Bing Xiang
- Intellicode Compose: Code Generation Using Transformer Alexey Svyatkovskiy, Shao Kun Deng, Shengyu Fu, Neel Sundaresan
- Fine-tuning BERT For Schema-guided Zero-shot Dialogue State Tracking Yu-ping Ruan, Zhen-hua Ling, Jia-chen Gu, Quan Liu
- Improving Natural Language Processing Tasks With Human Gaze-guided Neural Attention Ekta Sood, Simon Tannert, Philipp Mueller, Andreas Bulling
- Adapterhub: A Framework For Adapting Transformers Jonas Pfeiffer et al.
- Machine Reading Comprehension: The Role Of Contextualized Language Models And Beyond Zhuosheng Zhang, Hai Zhao, Rui Wang
- Grounding Language To Autonomously-acquired Skills Via Goal Generation Ahmed Akakzia, Cédric Colas, Pierre-yves Oudeyer, Mohamed Chetouani, Olivier Sigaud
- Mixkd: Towards Efficient Distillation Of Large-scale Language Models Kevin J Liang et al.
- Schema-guided Dialogue State Tracking Task At DSTC8 Abhinav Rastogi, Xiaoxue Zang, Srinivas Sunkara, Raghav Gupta, Pranav Khaitan
- Incorporating External Knowledge Through Pre-training For Natural Language To Code Generation Frank F. Xu, Zhengbao Jiang, Pengcheng Yin, Bogdan Vasilescu, Graham Neubig
- Mintl: Minimalist Transfer Learning For Task-oriented Dialogue Systems Zhaojiang Lin, Andrea Madotto, Genta Indra Winata, Pascale Fung
- Cosda-ml: Multi-lingual Code-switching Data Augmentation For Zero-shot Cross-lingual NLP Libo Qin, Minheng Ni, Yue Zhang, Wanxiang Che
- Trading Off Diversity And Quality In Natural Language Generation Hugh Zhang, Daniel Duckworth, Daphne Ippolito, Arvind Neelakantan
- The Language Interpretability Tool: Extensible, Interactive Visualizations And Analysis For NLP Models Ian Tenney et al.
- Template Guided Text Generation For Task-oriented Dialogue Mihir Kale, Abhinav Rastogi
- Robust Encodings: A Framework For Combating Adversarial Typos Erik Jones, Robin Jia, Aditi Raghunathan, Percy Liang
- A Controllable Model Of Grounded Response Generation Zeqiu Wu et al.
- KGPT: Knowledge-grounded Pre-training For Data-to-text Generation Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang
- Data Manipulation: Towards Effective Instance Learning For Neural Dialogue Generation Via Learning To Augment And Reweight Hengyi Cai et al.
- Mixup-transformer: Dynamic Data Augmentation For NLP Tasks Lichao Sun et al.
- DSTC8-AVSD: Multimodal Semantic Transformer Network With Retrieval Style Word Generator Hwanhee Lee et al.
- Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation Ruibo Liu et al.
- Lightseq: A High Performance Inference Library For Transformers Xiaohui Wang, Ying Xiong, Yang Wei, Mingxuan Wang, Lei Li
- HAT: Hardware-aware Transformers For Efficient Natural Language Processing Hanrui Wang et al.
- Trojaning Language Models For Fun And Profit Xinyang Zhang, Zheng Zhang, Shouling Ji, Ting Wang
- Logical Natural Language Generation From Open-domain Tables Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang
- Mitigating Gender Bias For Neural Dialogue Generation With Adversarial Learning Haochen Liu et al.
- Will I Sound Like Me? Improving Persona Consistency In Dialogues Through Pragmatic Self-consciousness Hyunwoo Kim, Byeongchang Kim, Gunhee Kim
- Vokenization: Improving Language Understanding With Contextualized, Visual-grounded Supervision Hao Tan, Mohit Bansal
- Ernie-vilg: Unified Generative Pre-training For Bidirectional Vision-language Generation Han Zhang et al.
- E2E-VLP: End-to-end Vision-language Pre-training Enhanced By Visual Learning Haiyang Xu et al.
- Wenlan: Bridging Vision And Language By Large-scale Multi-modal Pre-training Yuqi Huo et al.
- Summ^n: A Multi-stage Summarization Framework For Long Input Dialogues And Documents Yusen Zhang et al.
- Unifying Multimodal Transformer For Bi-directional Image And Text Generation Yupan Huang, Hongwei Xue, Bei Liu, Yutong Lu
- Bartscore: Evaluating Generated Text As Text Generation Weizhe Yuan, Graham Neubig, Pengfei Liu
- Progressive Transformer-based Generation Of Radiology Reports Farhad Nooralahzadeh, Nicolas Perez Gonzalez, Thomas Frauenfelder, Koji Fujimoto, Michael Krauthammer
- On The Safety Of Conversational Models: Taxonomy, Dataset, And Benchmark Hao Sun et al.
- Retrieval Augmented Code Generation And Summarization Md Rizwan Parvez, Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-wei Chang
- Fine-tuning Large Neural Language Models For Biomedical Natural Language Processing Robert Tinn et al.
- ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Shuohuan Wang et al.
- Fewshotqa: A Simple Framework For Few-shot Learning Of Question Answering Tasks Using Pre-trained Text-to-text Models Rakesh Chada, Pradeep Natarajan
- Mitigating Political Bias In Language Models Through Reinforced Calibration Ruibo Liu et al.
- Training Large-scale News Recommenders With Pretrained Language Models In The Loop Shitao Xiao, Zheng Liu, Yingxia Shao, Tao Di, Xing Xie
- Fast Model Editing At Scale Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning
- Unipelt: A Unified Framework For Parameter-efficient Language Model Tuning Yuning Mao et al.
- Revealing Persona Biases In Dialogue Systems Emily Sheng, Josh Arnold, Zhou Yu, Kai-wei Chang, Nanyun Peng
- Societal Biases In Language Generation: Progress And Challenges Emily Sheng, Kai-wei Chang, Premkumar Natarajan, Nanyun Peng
- Arat5: Text-to-text Transformers For Arabic Language Generation El Moatez Billah Nagoudi, Abdelrahim Elmadany, Muhammad Abdul-mageed
- Align And Prompt: Video-and-language Pre-training With Entity Prompts Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C. H. Hoi
- Clip4caption: CLIP For Video Caption Mingkang Tang et al.
- Generate, Annotate, And Learn: NLP With Synthetic Text Xuanli He, Islam Nassar, Jamie Kiros, Gholamreza Haffari, Mohammad Norouzi
- Urltran: Improving Phishing URL Detection Using Transformers Pranav Maneriker et al.
- Dialoglm: Pre-trained Model For Long Dialogue Understanding And Summarization Ming Zhong, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng
- UC2: Universal Cross-lingual Cross-modal Vision-and-language Pre-training Mingyang Zhou et al.
- Primer: Searching For Efficient Transformers For Language Modeling David R. So et al.
- PPT: Pre-trained Prompt Tuning For Few-shot Learning Yuxian Gu, Xu Han, Zhiyuan Liu, Minlie Huang
- Zero-shot Recommendation As Language Modeling Damien Sileo, Wout Vossen, Robbe Raymaekers
- Differentially Private Fine-tuning Of Language Models Da Yu et al.
- Why Do Pretrained Language Models Help In Downstream Tasks? An Analysis Of Head And Prompt Tuning Colin Wei, Sang Michael Xie, Tengyu Ma
- Language Model Evaluation Beyond Perplexity Clara Meister, Ryan Cotterell
- Codet5: Identifier-aware Unified Pre-trained Encoder-decoder Models For Code Understanding And Generation Yue Wang, Weishi Wang, Shafiq Joty, Steven C. H. Hoi
- One Teacher Is Enough? Pre-trained Language Model Distillation From Multiple Teachers Chuhan Wu, Fangzhao Wu, Yongfeng Huang
- Newsbert: Distilling Pre-trained Language Model For Intelligent News Application Chuhan Wu et al.
- Empowering News Recommendation With Pre-trained Language Models Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang
- Supervision Exists Everywhere: A Data Efficient Contrastive Language-image Pre-training Paradigm Yangguang Li et al.
- LFPT5: A Unified Framework For Lifelong Few-shot Language Learning Based On Prompt Tuning Of T5 Chengwei Qin, Shafiq Joty
- Efficient Retrieval Augmented Generation From Unstructured Knowledge For Task-oriented Dialog David Thulke, Nico Daheim, Christian Dugast, Hermann Ney
- Pre-train, Prompt, And Predict: A Systematic Survey Of Prompting Methods In Natural Language Processing Pengfei Liu et al.
- Terapipe: Token-level Pipeline Parallelism For Training Large-scale Language Models Zhuohan Li et al.
- NÜWA: Visual Synthesis Pre-training For Neural Visual World Creation Chenfei Wu et al.
- Non-invasive Self-attention For Side Information Fusion In Sequential Recommendation Chang Liu et al.
- Simvlm: Simple Visual Language Model Pretraining With Weak Supervision Zirui Wang et al.
- Is GPT-3 Text Indistinguishable From Human Text? Scarecrow: A Framework For Scrutinizing Machine Text Yao Dou, Maxwell Forbes, Rik Koncel-kedziorski, Noah A. Smith, Yejin Choi
- A Short Survey Of Pre-trained Language Models For Conversational AI-A Newage In NLP Munazza Zaib, Quan Z. Sheng, Wei Emma Zhang
- Symbolic Knowledge Distillation: From General Language Models To Commonsense Models Peter West et al.
- Dynaboard: An Evaluation-as-a-service Platform For Holistic Next-generation Benchmarking Zhiyi Ma et al.
- Openprompt: An Open-source Framework For Prompt-learning Ning Ding et al.
- Scheduled Sampling In Vision-language Pretraining With Decoupled Encoder-decoder Network Yehao Li, Yingwei Pan, Ting Yao, Jingwen Chen, Tao Mei
- SGEITL: Scene Graph Enhanced Image-text Learning For Visual Commonsense Reasoning Zhecan Wang et al.
- Maria: Spanish Language Models Asier Gutiérrez-fandiño et al.
- Towards Facilitating Empathic Conversations In Online Mental Health Support: A Reinforcement Learning Approach Ashish Sharma, Inna W. Lin, Adam S. Miner, David C. Atkins, Tim Althoff
- Multilingual LAMA: Investigating Knowledge In Multilingual Pretrained Language Models Nora Kassner, Philipp Dufter, Hinrich Schütze
- Vl-adapter: Parameter-efficient Transfer Learning For Vision-and-language Tasks Yi-lin Sung, Jaemin Cho, Mohit Bansal
- ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Yu Sun et al.
- An Empirical Study Of Training End-to-end Vision-and-language Transformers Zi-yi Dou et al.
- GLM: General Language Model Pretraining With Autoregressive Blank Infilling Zhengxiao Du et al.
- Demix Layers: Disentangling Domains For Modular Language Modeling Suchin Gururangan, Mike Lewis, Ari Holtzman, Noah A. Smith, Luke Zettlemoyer
- Denseclip: Language-guided Dense Prediction With Context-aware Prompting Yongming Rao et al.
- COCO-LM: Correcting And Contrasting Text Sequences For Language Model Pretraining Yu Meng et al.
- Quiz-style Question Generation For News Stories Adam D. Lelkes, Vinh Q. Tran, Cong Yu
- Scalable And Efficient Moe Training For Multitask Multilingual Models Young Jin Kim et al.
- Constrained Language Models Yield Few-shot Semantic Parsers Richard Shin et al.
- OPT: Omni-perception Pre-trainer For Cross-modal Understanding And Generation Jing Liu et al.
- Exploring Transformers In Natural Language Generation: GPT, BERT, And Xlnet M. Onat Topal, Anil Bas, Imke Van Heerden
- I-BERT: Integer-only BERT Quantization Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer
- E-vil: A Dataset And Benchmark For Natural Language Explanations In Vision-language Tasks Maxime Kayser et al.
- Deltalm: Encoder-decoder Pre-training For Language Generation And Translation By Augmenting Pretrained Multilingual Encoders Shuming Ma et al.
- Metaicl: Learning To Learn In Context Sewon Min, Mike Lewis, Luke Zettlemoyer, Hannaneh Hajishirzi
- Fastmoe: A Fast Mixture-of-expert Training System Jiaao He et al.
- Codexglue: A Machine Learning Benchmark Dataset For Code Understanding And Generation Shuai Lu et al.
- Multimodal Few-shot Learning With Frozen Language Models Maria Tsimpoukelli et al.
- Unifying Vision-and-language Tasks Via Text Generation Jaemin Cho, Jie Lei, Hao Tan, Mohit Bansal
- Redditbias: A Real-world Resource For Bias Evaluation And Debiasing Of Conversational Language Models Soumya Barikeri, Anne Lauscher, Ivan Vulić, Goran Glavaš
- Augmenting Sequential Recommendation With Pseudo-prior Items Via Reversely Pre-training Transformer Zhiwei Liu, Ziwei Fan, Yu Wang, Philip S. Yu
- Improving Language Models By Retrieving From Trillions Of Tokens Sebastian Borgeaud et al.
- Retromae: Pre-training Retrieval-oriented Language Models Via Masked Auto-encoder Shitao Xiao, Zheng Liu, Yingxia Shao, Zhao Cao
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- Explanations From Large Language Models Make Small Reasoners Better Shiyang Li et al.
- Reacc: A Retrieval-augmented Code Completion Framework Shuai Lu et al.
- Coderl: Mastering Code Generation Through Pretrained Models And Deep Reinforcement Learning Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Silvio Savarese, Steven C. H. Hoi
- Recommendation As Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5) Shijie Geng, Shuchang Liu, Zuohui Fu, Yingqiang Ge, Yongfeng Zhang
- Selective Annotation Makes Language Models Better Few-shot Learners Hongjin Su et al.
- Teaching Algorithmic Reasoning Via In-context Learning Hattie Zhou et al.
- In-context Learning For Few-shot Dialogue State Tracking Yushi Hu et al.
- Contrastive Learning With Bidirectional Transformers For Sequential Recommendation Hanwen Du et al.
- A Survey Of Controllable Text Generation Using Transformer-based Pre-trained Language Models Hanqing Zhang, Haolin Song, Shaoyu Li, Ming Zhou, Dawei Song
- Pali: A Jointly-scaled Multilingual Language-image Model Xi Chen et al.
- Diffusiondb: A Large-scale Prompt Gallery Dataset For Text-to-image Generative Models Zijie J. Wang et al.
- Lost At C: A User Study On The Security Implications Of Large Language Model Code Assistants Gustavo Sandoval et al.
- Dialfred: Dialogue-enabled Agents For Embodied Instruction Following Xiaofeng Gao et al.
- Synchromesh: Reliable Code Generation From Pre-trained Language Models Gabriel Poesia et al.
- Moral Mimicry: Large Language Models Produce Moral Rationalizations Tailored To Political Identity Gabriel Simmons
- Ignore Previous Prompt: Attack Techniques For Language Models Fábio Perez, Ian Ribeiro
- Re3: Generating Longer Stories With Recursive Reprompting And Revision Kevin Yang, Yuandong Tian, Nanyun Peng, Dan Klein
- Flashattention: Fast And Memory-efficient Exact Attention With Io-awareness Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré
- Large Language Models Encode Clinical Knowledge Karan Singhal et al.
- Minicons: Enabling Flexible Behavioral And Representational Analyses Of Transformer Language Models Kanishka Misra
- Healthprompt: A Zero-shot Learning Paradigm For Clinical Natural Language Processing Sonish Sivarajkumar, Yanshan Wang
- BLIP: Bootstrapping Language-image Pre-training For Unified Vision-language Understanding And Generation Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi
- Automatic Generation Of Programming Exercises And Code Explanations Using Large Language Models Sami Sarsa, Paul Denny, Arto Hellas, Juho Leinonen
- Deepspeed-moe: Advancing Mixture-of-experts Inference And Training To Power Next-generation AI Scale Samyam Rajbhandari et al.
- OPT-IML: Scaling Language Model Instruction Meta Learning Through The Lens Of Generalization Srinivasan Iyer et al.
- Instruction Tuning For Few-shot Aspect-based Sentiment Analysis Siddharth Varia et al.
- Controllable Natural Language Generation With Contrastive Prefixes Jing Qian, Li Dong, Yelong Shen, Furu Wei, Weizhu Chen
- Incorporating Domain Knowledge Through Task Augmentation For Front-end Javascript Code Generation Sijie Shen et al.
- Knowledge Prompting In Pre-trained Language Model For Natural Language Understanding Jianing Wang et al.
- Improving The Domain Adaptation Of Retrieval Augmented Generation (RAG) Models For Open Domain Question Answering Shamane Siriwardhana et al.
- React: Synergizing Reasoning And Acting In Language Models Shunyu Yao et al.
- Flamingo: A Visual Language Model For Few-shot Learning Jean-baptiste Alayrac et al.
- Benchmarking Large Language Models For Automated Verilog RTL Code Generation Shailja Thakur et al.
- Using Deepspeed And Megatron To Train Megatron-turing NLG 530B, A Large-scale Generative Language Model Shaden Smith et al.
- Action-gpt: Leveraging Large-scale Language Models For Improved And Generalized Action Generation Sai Shashank Kalakonda, Shubh Maheshwari, Ravi Kiran Sarvadevabhatla
- Prompting Is Programming: A Query Language For Large Language Models Luca Beurer-kellner, Marc Fischer, Martin Vechev
- Training Language Models To Follow Instructions With Human Feedback Long Ouyang et al.
- Instructionner: A Multi-task Instruction-based Generative Framework For Few-shot NER Liwen Wang et al.
- Efficient Few-shot Learning Without Prompts Lewis Tunstall et al.
- Is Reinforcement Learning (not) For Natural Language Processing: Benchmarks, Baselines, And Building Blocks For Natural Language Policy Optimization Rajkumar Ramamurthy et al.
- Who Is GPT-3? An Exploration Of Personality, Values And Demographics Marilù Miotto, Nicola Rossberg, Bennett Kleinberg
- OPT: Open Pre-trained Transformer Language Models Susan Zhang et al.
- Vindlu: A Recipe For Effective Video-and-language Pretraining Feng Cheng et al.
- NLX-GPT: A Model For Natural Language Explanations In Vision And Vision-language Tasks Fawaz Sammani, Tanmoy Mukherjee, Nikos Deligiannis
- Sgva-clip: Semantic-guided Visual Adapting Of Vision-language Models For Few-shot Image Classification Fang Peng, Xiaoshan Yang, Linhui Xiao, Yaowei Wang, Changsheng Xu
- Long Time No See! Open-domain Conversation With Long-term Persona Memory Xinchao Xu et al.
- Codegen: An Open Large Language Model For Code With Multi-turn Program Synthesis Erik Nijkamp et al.
- Capturing Failures Of Large Language Models Via Human Cognitive Biases Erik Jones, Jacob Steinhardt
- Vl-interpret: An Interactive Visualization Tool For Interpreting Vision-language Transformers Estelle Aflalo et al.
- Bytetransformer: A High-performance Transformer Boosted For Variable-length Inputs Yujia Zhai et al.
- LAVIS: A Library For Language-vision Intelligence Dongxu Li et al.
- Self-adaptive In-context Learning: An Information Compression Perspective For In-context Example Selection And Ordering Zhiyong Wu, Yaoxiang Wang, Jiacheng Ye, Lingpeng Kong
- Binding Language Models In Symbolic Languages Zhoujun Cheng et al.
- Learning Vector-quantized Item Representation For Transferable Sequential Recommenders Yupeng Hou, Zhankui He, Julian Mcauley, Wayne Xin Zhao
- Rationale-augmented Ensembles In Language Models Xuezhi Wang et al.
- Large Language Models Meet Nl2code: A Survey Daoguang Zan et al.
- CERT: Continual Pre-training On Sketches For Library-oriented Code Generation Daoguang Zan et al.
- Hungry Hungry Hippos: Towards Language Modeling With State Space Models Daniel Y. Fu et al.
- Language And Culture Internalisation For Human-like Autotelic AI Cédric Colas, Tristan Karch, Clément Moulin-frier, Pierre-yves Oudeyer
- Competition-level Code Generation With Alphacode Yujia Li et al.
- Cont: Contrastive Neural Text Generation Chenxin An et al.
- Prompting GPT-3 To Be Reliable Chenglei Si et al.
- A Unified End-to-end Retriever-reader Framework For Knowledge-based VQA Yangyang Guo et al.
- A Unified Multi-task Learning Framework For Multi-goal Conversational Recommender Systems Yang Deng et al.
- Optimizing Prompts For Text-to-image Generation Yaru Hao, Zewen Chi, Li Dong, Furu Wei
- Iteratively Prompt Pre-trained Language Models For Chain Of Thought Boshi Wang, Xiang Deng, Huan Sun
- Audiolm: A Language Modeling Approach To Audio Generation Zalán Borsos et al.
- Language Models Can See: Plugging Visual Controls In Text Generation Yixuan Su et al.
- Attributed Question Answering: Evaluation And Modeling For Attributed Large Language Models Bernd Bohnet et al.
- Multi-lingual Evaluation Of Code Generation Models Ben Athiwaratkun et al.
- T-NER: An All-round Python Library For Transformer-based Named Entity Recognition Asahi Ushio, Jose Camacho-collados
- Reshaping Robot Trajectories Using Natural Language Commands: A Study Of Multi-modal Data Alignment Using Transformers Arthur Bucker et al.
- Grips: Gradient-free, Edit-based Instruction Search For Prompting Large Language Models Archiki Prasad, Peter Hase, Xiang Zhou, Mohit Bansal
- Selection-inference: Exploiting Large Language Models For Interpretable Logical Reasoning Antonia Creswell, Murray Shanahan, Irina Higgins
- UL2: Unifying Language Learning Paradigms Yi Tay et al.
- Plug-and-play VQA: Zero-shot VQA By Conjoining Large Pretrained Models With Zero Training Anthony Meng Huat Tiong, Junnan Li, Boyang Li, Silvio Savarese, Steven C. H. Hoi
- Internet-augmented Language Models Through Few-shot Prompting For Open-domain Question Answering Angeliki Lazaridou, Elena Gribovskaya, Wojciech Stokowiec, Nikolai Grigorev
- Socratic Models: Composing Zero-shot Multimodal Reasoning With Language Andy Zeng et al.
- Don't Generate, Discriminate: A Proposal For Grounding Language Models To Real-world Environments Yu Gu, Xiang Deng, Yu Su
- Prompt-to-prompt Image Editing With Cross Attention Control Amir Hertz et al.
- Commonsenseqa 2.0: Exposing The Limits Of AI Through Gamification Alon Talmor et al.
- Should You Mask 15% In Masked Language Modeling? Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen
- Position-guided Text Prompt For Vision-language Pre-training Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan
- Personalized Prompt For Sequential Recommendation Yiqing Wu et al.
- Dualprompt: Complementary Prompting For Rehearsal-free Continual Learning Zifeng Wang et al.
- A Systematic Review And Replicability Study Of Bert4rec For Sequential Recommendation Aleksandr Petrov, Craig Macdonald
- Prompt Tuning For Discriminative Pre-trained Language Models Yuan Yao et al.
- What Is It Like To Program With Artificial Intelligence? Advait Sarkar et al.
- Scaling Up Models And Data With t5x And seqio Adam Roberts et al.
- TALM: Tool Augmented Language Models Aaron Parisi, Yao Zhao, Noah Fiedel
- Solving Quantitative Reasoning Problems With Language Models Aitor Lewkowycz et al.
- PEVL: Position-enhanced Pre-training And Prompt Tuning For Vision-language Models Yuan Yao et al.
- Code Generation Tools (almost) For Free? A Study Of Few-shot, Pre-trained Language Models On Code Patrick Bareiß, Beatriz Souza, Marcelo D'amorim, Michael Pradel
- Chatgpt: The End Of Online Exam Integrity? Teo Susnjak
- Demonstrate-search-predict: Composing Retrieval And Language Models For Knowledge-intensive NLP Omar Khattab et al.
- Delta Tuning: A Comprehensive Study Of Parameter Efficient Methods For Pre-trained Language Models Ning Ding et al.
- Unifiedskg: Unifying And Multi-tasking Structured Knowledge Grounding With Text-to-text Language Models Tianbao Xie et al.
- Vl-checklist: Evaluating Pre-trained Vision-language Models With Objects, Attributes And Relations Tiancheng Zhao et al.
- Talking About Large Language Models Murray Shanahan
- Toxigen: A Large-scale Machine-generated Dataset For Adversarial And Implicit Hate Speech Detection Thomas Hartvigsen et al.
- Efficient Training Of Language Models To Fill In The Middle Mohammad Bavarian et al.
- KALA: Knowledge-augmented Language Model Adaptation Minki Kang, Jinheon Baek, Sung Ju Hwang
- Towards A Unified Multi-dimensional Evaluator For Text Generation Ming Zhong et al.
- Evaluating Human-language Model Interaction Mina Lee et al.
- Few-shot Training Llms For Project-specific Code-summarization Toufique Ahmed, Premkumar Devanbu
- GPT Takes The Bar Exam Michael Bommarito II, Daniel Martin Katz
- Black-box Tuning For Language-model-as-a-service Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu
- Confident Adaptive Language Modeling Tal Schuster et al.
- Promptsource: An Integrated Development Environment And Repository For Natural Language Prompts Stephen H. Bach et al.
- Decomposed Prompting: A Modular Approach For Solving Complex Tasks Tushar Khot et al.
- Instructdial: Improving Zero And Few-shot Generalization In Dialogue Through Instruction Tuning Prakhar Gupta et al.
- Co-writing Screenplays And Theatre Scripts With Language Models: An Evaluation By Industry Professionals Piotr Mirowski, Kory W. Mathewson, Jaylen Pittman, Richard Evans
- 3DALL-E: Integrating Text-to-image AI In 3D Design Workflows Vivian Liu, Jo Vermeulen, George Fitzmaurice, Justin Matejka
- OFA: Unifying Architectures, Tasks, And Modalities Through A Simple Sequence-to-sequence Learning Framework Peng Wang et al.
- Quantifying Language Models' Sensitivity To Spurious Features In Prompt Design Or: How I Learned To Start Worrying About Prompt Formatting Melanie Sclar, Yejin Choi, Yulia Tsvetkov, Alane Suhr
- Zero- And Few-shot Prompting With Llms: A Comparative Study With Fine-tuned Models For Bangla Sentiment Analysis Md. Arid Hasan et al.
- Can Llms Express Their Uncertainty? An Empirical Evaluation Of Confidence Elicitation In Llms Miao Xiong et al.
- Enhancing CLIP With GPT-4: Harnessing Visual Descriptions As Prompts Mayug Maniparambil et al.
- Leancontext: Cost-efficient Domain-specific Question Answering Using Llms Md Adnan Arefeen, Biplob Debnath, Srimat Chakradhar
- An Empirical Evaluation Of Using Large Language Models For Automated Unit Test Generation Max Schäfer, Sarah Nadi, Aryaz Eghbali, Frank Tip
- Nl2spec: Interactively Translating Unstructured Natural Language To Temporal Logics With Large Language Models Matthias Cosler, Christopher Hahn, Daniel Mendoza, Frederik Schmitt, Caroline Trippel
- Distilling Large Language Models For Matching Patients To Clinical Trials Mauro Nievas, Aditya Basu, Yanshan Wang, Hrituraj Singh
- Flexkbqa: A Flexible Llm-powered Framework For Few-shot Knowledge Base Question Answering Zhenyu Li et al.
- Generative Artificial Intelligence In Learning Analytics: Contextualising Opportunities And Challenges Through The Learning Analytics Cycle Lixiang Yan, Roberto Martinez-maldonado, Dragan Gašević
- From Word Models To World Models: Translating From Natural Language To The Probabilistic Language Of Thought Lionel Wong et al.
- Leveraging Pre-trained Large Language Models To Construct And Utilize World Models For Model-based Task Planning Lin Guan, Karthik Valmeekam, Sarath Sreedharan, Subbarao Kambhampati
- Do Llms Exhibit Human-like Response Biases? A Case Study In Survey Design Lindia Tjuatja, Valerie Chen, Sherry Tongshuang Wu, Ameet Talwalkar, Graham Neubig
- Reasoning On Graphs: Faithful And Interpretable Large Language Model Reasoning Linhao Luo, Yuan-fang Li, Gholamreza Haffari, Shirui Pan
- Judging Llm-as-a-judge With Mt-bench And Chatbot Arena Lianmin Zheng et al.
- Logic-lm: Empowering Large Language Models With Symbolic Solvers For Faithful Logical Reasoning Liangming Pan, Alon Albalak, Xinyi Wang, William Yang Wang
- A Survey On Large Language Models For Recommendation Likang Wu et al.
- Can Chatgpt Replace Stackoverflow? A Study On Robustness And Reliability Of Large Language Model Code Generation Li Zhong, Zilong Wang
- ULIP-2: Towards Scalable Multimodal Pre-training For 3D Understanding Le Xue et al.
- Mvbench: A Comprehensive Multi-modal Video Understanding Benchmark Kunchang Li et al.
- Superclue: A Comprehensive Chinese Large Language Model Benchmark Liang Xu et al.
- 14 Examples Of How Llms Can Transform Materials Science And Chemistry: A Reflection On A Large Language Model Hackathon Kevin Maik Jablonka et al.
- Tallrec: An Effective And Efficient Tuning Framework To Align Large Language Model With Recommendation Keqin Bao et al.
- Domain-specific Chatbots For Science Using Embeddings Kevin G. Yager
- LLM In A Flash: Efficient Large Language Model Inference With Limited Memory Keivan Alizadeh et al.
- Automating Customer Service Using Langchain: Building Custom Open-source GPT Chatbot For Organizations Keivalya Pandya, Mehfuza Holia
- Evaluating Language Models For Mathematics Through Interactions Katherine M. Collins et al.
- Biomedical Knowledge Graph-optimized Prompt Generation For Large Language Models Karthik Soman et al.
- Waffling Around For Performance: Visual Classification With Random Words And Broad Concepts Karsten Roth et al.
- Towards Expert-level Medical Question Answering With Large Language Models Karan Singhal et al.
- Chipgpt: How Far Are We From Natural Language Hardware Design Kaiyan Chang et al.
- Speechprompt V2: Prompt Tuning For Speech Classification Tasks Kai-wei Chang et al.
- Aligning Instruction Tasks Unlocks Large Language Models As Zero-shot Relation Extractors Kai Zhang, Bernal Jiménez Gutiérrez, Yu Su
- Not What You've Signed Up For: Compromising Real-world Llm-integrated Applications With Indirect Prompt Injection Kai Greshake et al.
- The Rise And Potential Of Large Language Model Based Agents: A Survey Zhiheng Xi et al.
- Evaluation And Analysis Of Hallucination In Large Vision-language Models Junyang Wang et al.
- Ai-augmented Surveys: Leveraging Large Language Models And Surveys For Opinion Prediction Junsol Kim, Byungkyu Lee
- Agentcf: Collaborative Learning With Autonomous Language Agents For Recommender Systems Junjie Zhang et al.
- Recommendation As Instruction Following: A Large Language Model Empowered Recommendation Approach Junjie Zhang et al.
- Llama-reviewer: Advancing Code Review Automation With Large Language Models Through Parameter-efficient Fine-tuning Junyi Lu, Lei Yu, Xiaojia Li, Li Yang, Chun Zuo
- Evaluating GPT-4 And Chatgpt On Japanese Medical Licensing Examinations Jungo Kasai, Yuhei Kasai, Keisuke Sakaguchi, Yutaro Yamada, Dragomir Radev
- Backdooring Instruction-tuned Large Language Models With Virtual Prompt Injection Jun Yan et al.
- MEGA: Multilingual Evaluation Of Generative AI Kabir Ahuja et al.
- Missrec: Pre-training And Transferring Multi-modal Interest-aware Sequence Representation For Recommendation Jinpeng Wang et al.
- Gptscore: Evaluate As You Desire Jinlan Fu, See-kiong Ng, Zhengbao Jiang, Pengfei Liu
- Bias And Fairness In Chatbots: An Overview Jintang Xue et al.
- Structgpt: A General Framework For Large Language Model To Reason Over Structured Data Jinhao Jiang et al.
- When Large Language Models Meet Personalization: Perspectives Of Challenges And Opportunities Jin Chen et al.
- Empowering Molecule Discovery For Molecule-caption Translation With Large Language Models: A Chatgpt Perspective Jiatong Li et al.
- Ethical Chatgpt: Concerns, Challenges, And Commandments Jianlong Zhou, Heimo Müller, Andreas Holzinger, Fang Chen
- Think-on-graph: Deep And Responsible Reasoning Of Large Language Model On Knowledge Graph Jiashuo Sun et al.
- How Can Recommender Systems Benefit From Large Language Models: A Survey Jianghao Lin et al.
- The Impact Of Chatgpt And Llms On Medical Imaging Stakeholders: Perspectives And Use Cases Jiancheng Yang, Hongwei Bran Li, Donglai Wei
- Rella: Retrieval-enhanced Large Language Models For Lifelong Sequential Behavior Comprehension In Recommendation Jianghao Lin et al.
- Onellm: One Framework To Align All Modalities With Language Jiaming Han et al.
- ICL-D3IE: In-context Learning With Diverse Demonstrations Updating For Document Information Extraction Jiabang He et al.
- Graphgpt: Graph Instruction Tuning For Large Language Models Jiabin Tang et al.
- Unlearn What You Want To Forget: Efficient Unlearning For Llms Jiaao Chen, Diyi Yang
- VILA: On Pre-training For Visual Language Models Ji Lin et al.
- Large Language Models In Medicine: The Potentials And Pitfalls Jesutofunmi A. Omiye, Haiwen Gui, Shawheen J. Rezaei, James Zou, Roxana Daneshjou
- AWQ: Activation-aware Weight Quantization For LLM Compression And Acceleration Ji Lin et al.
- Physically Grounded Vision-language Models For Robotic Manipulation Jensen Gao et al.
- Artificial Muses: Generative Artificial Intelligence Chatbots Have Risen To Human-level Creativity Jennifer Haase, Paul H. P. Hanel
- Prompting Is Not A Substitute For Probability Measurements In Large Language Models Jennifer Hu, Roger Levy
- The Robots Are Here: Navigating The Generative AI Revolution In Computing Education James Prather et al.
- More Robots Are Coming: Large Multimodal Models (chatgpt) Can Solve Visually Diverse Images Of Parsons Problems Irene Hou et al.
- Factuality Challenges In The Era Of Large Language Models Isabelle Augenstein et al.
- Chainforge: A Visual Toolkit For Prompt Engineering And LLM Hypothesis Testing Ian Arawjo, Chelse Swoopes, Priyan Vaithilingam, Martin Wattenberg, Elena Glassman
- A Comprehensive Overview Of Large Language Models Humza Naveed et al.
- The Bigscience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset Hugo Laurençon et al.
- Ip-adapter: Text Compatible Image Prompt Adapter For Text-to-image Diffusion Models Hu Ye, Jun Zhang, Sibo Liu, Xiao Han, Wei Yang
- Chatgpt Chemistry Assistant For Text Mining And Prediction Of MOF Synthesis Zhiling Zheng, Oufan Zhang, Christian Borgs, Jennifer T. Chayes, Omar M. Yaghi
- Building Cooperative Embodied Agents Modularly With Large Language Models Hongxin Zhang et al.
- Llmind: Orchestrating AI And Iot With LLM For Complex Task Execution Hongwei Cui, Yuyang Du, Qun Yang, Yulin Shao, Soung Chang Liew
- Large Language Models Can Infer Psychological Dispositions Of Social Media Users Heinrich Peters, Sandra Matz
- Self-chained Image-language Model For Video Localization And Question Answering Shoubin Yu, Jaemin Cho, Prateek Yadav, Mohit Bansal
- Unifying Large Language Models And Knowledge Graphs: A Roadmap Shirui Pan et al.
- Llm-in-the-loop: Leveraging Large Language Model For Thematic Analysis Shih-chieh Dai, Aiping Xiong, Lun-wei Ku
- Reasoning With Language Model Is Planning With World Model Shibo Hao et al.
- Large Language Model Augmented Narrative Driven Recommendations Sheshera Mysore, Andrew Mccallum, Hamed Zamani
- Toolkengpt: Augmenting Frozen Language Models With Massive Tools Via Tool Embeddings Shibo Hao, Tianyang Liu, Zhen Wang, Zhiting Hu
- Gorilla: Large Language Model Connected With Massive Apis Shishir G. Patil, Tianjun Zhang, Xin Wang, Joseph E. Gonzalez
- Mixture-of-experts Meets Instruction Tuning: A Winning Combination For Large Language Models Sheng Shen et al.
- Evaluation Of Chatgpt Family Of Models For Biomedical Reasoning And Classification Shan Chen et al.
- Luminate: Structured Generation And Exploration Of Design Space With Large Language Models For Human-ai Co-creation Sangho Suh, Meng Chen, Bryan Min, Toby Jia-jun Li, Haijun Xia
- Chatgpt Or Human? Detect And Explain. Explaining Decisions Of Machine Learning Model For Detecting Short Chatgpt-generated Text Sandra Mitrović, Davide Andreoletti, Omran Ayoub
- Ai-assisted Coding: Experiments With GPT-4 Russell A Poldrack, Thomas Lu, Gašper Beguš
- Verify-and-edit: A Knowledge-enhanced Chain-of-thought Framework Ruochen Zhao, Xingxuan Li, Shafiq Joty, Chengwei Qin, Lidong Bing
- Does Synthetic Data Generation Of Llms Help Clinical Text Mining? Ruixiang Tang, Xiaotian Han, Xiaoqian Jiang, Xia Hu
- Chatgpt Vs. Google: A Comparative Study Of Search Performance And User Experience Ruiyun Rayna Xu, Yue Katherine Feng, Hailiang Chen
- Gpt4tools: Teaching Large Language Model To Use Tools Via Self-instruction Rui Yang et al.
- Secrets Of RLHF In Large Language Models Part I: PPO Rui Zheng et al.
- Tinystories: How Small Can Language Models Be And Still Speak Coherent English? Ronen Eldan, Yuanzhi Li
- In-context Learning Creates Task Vectors Roee Hendel, Mor Geva, Amir Globerson
- Llm-assisted Content Analysis: Using Large Language Models To Support Deductive Coding Robert Chew, John Bollenbacher, Michael Wenger, Jessica Speer, Annice Kim
- Chatgpt Versus Traditional Question Answering For Knowledge Graphs: Current Status And Future Directions Towards Knowledge Graph Chatbots Reham Omar, Omij Mangukiya, Panos Kalnis, Essam Mansour
- A Universal Question-answering Platform For Knowledge Graphs Reham Omar, Ishika Dhall, Panos Kalnis, Essam Mansour
- Automatic Prompt Optimization With "gradient Descent" And Beam Search Reid Pryzant et al.
- Lawyer Llama Technical Report Quzhe Huang et al.
- Codegeex: A Pre-trained Model For Code Generation With Multilingual Benchmarking On Humaneval-x Qinkai Zheng et al.
- ONCE: Boosting Content-based Recommendation With Both Open- And Closed-source Large Language Models Qijiong Liu, Nuo Chen, Tetsuya Sakai, Xiao-ming Wu
- Faithful Chain-of-thought Reasoning Qing Lyu et al.
- Autogen: Enabling Next-gen LLM Applications Via Multi-agent Conversation Qingyun Wu et al.
- Pushing Large Language Models To The 6G Edge: Vision, Challenges, And Opportunities Zheng Lin et al.
- Genegpt: Augmenting Large Language Models With Domain Tools For Improved Access To Biomedical Information Qiao Jin, Yifan Yang, Qingyu Chen, Zhiyong Lu
- Designerly Understanding: Information Needs For Model Transparency To Support Design Ideation For Ai-powered User Experience Q. Vera Liao, Hariharan Subramonyam, Jennifer Wang, Jennifer Wortman Vaughan
- Students' Perceptions And Preferences Of Generative Artificial Intelligence Feedback For Programming Zhengdong Zhang et al.
- Regulating Chatgpt And Other Large Generative AI Models Philipp Hacker, Andreas Engel, Marco Mauer
- Visually-prompted Language Model For Fine-grained Scene Graph Generation In An Open World Qifan Yu et al.
- Chat-univi: Unified Visual Representation Empowers Large Language Models With Image And Video Understanding Peng Jin, Ryuichi Takanobu, Wancai Zhang, Xiaochun Cao, Li Yuan
- Llama-adapter V2: Parameter-efficient Visual Instruction Model Peng Gao et al.
- Starcoder: May The Source Be With You! Raymond Li et al.
- VISAR: A Human-ai Argumentative Writing Assistant With Visual Programming And Rapid Draft Prototyping Zheng Zhang, Jie Gao, Ranjodh Singh Dhaliwal, Toby Jia-jun Li
- In-context Retrieval-augmented Language Models Ori Ram et al.
- Dspy: Compiling Declarative Language Model Calls Into Self-improving Pipelines Omar Khattab et al.
- Chameleon: Plug-and-play Compositional Reasoning With Large Language Models Pan Lu et al.
- Faith And Fate: Limits Of Transformers On Compositionality Nouha Dziri et al.
- Reflexion: Language Agents With Verbal Reinforcement Learning Noah Shinn et al.
- Enhancing Chat Language Models By Scaling High-quality Instructional Conversations Ning Ding et al.
- CAT-LM: Training Language Models On Aligned Code And Tests Nikitha Rao, Kush Jain, Uri Alon, Claire Le Goues, Vincent J. Hellendoorn
- Self-contradictory Hallucinations Of Large Language Models: Evaluation, Detection And Mitigation Niels Mündler, Jingxuan He, Slobodan Jenko, Martin Vechev
- Consistency Analysis Of Chatgpt Myeongjun Erik Jang, Thomas Lukasiewicz
- Benefits And Harms Of Large Language Models In Digital Mental Health Munmun De Choudhury, Sachin R. Pendse, Neha Kumar
- Self-regulating Prompts: Foundational Model Adaptation Without Forgetting Muhammad Uzair Khattak et al.
- Abscribe: Rapid Exploration & Organization Of Multiple Writing Variations In Human-ai Co-writing Tasks Using Large Language Models Mohi Reza et al.
- Api-bank: A Comprehensive Benchmark For Tool-augmented Llms Minghao Li et al.
- Time-llm: Time Series Forecasting By Reprogramming Large Language Models Ming Jin et al.
- Reward Design With Language Models Minae Kwon, Sang Michael Xie, Kalesha Bullard, Dorsa Sadigh
- Detecting Llm-generated Text In Computing Education: A Comparative Study For Chatgpt Cases Michael Sheinman Orenstrakh, Oscar Karnalim, Carlos Anibal Suarez, Michael Liut
- Selenite: Scaffolding Online Sensemaking With Comprehensive Overviews Elicited From Large Language Models Michael Xieyang Liu et al.
- Video-chatgpt: Towards Detailed Video Understanding Via Large Vision And Language Models Muhammad Maaz, Hanoona Rasheed, Salman Khan, Fahad Shahbaz Khan
- LAMM: Language-assisted Multi-modal Instruction-tuning Dataset, Framework, And Benchmark Zhenfei Yin et al.
- Psy-llm: Scaling Up Global Mental Health Psychological Services With Ai-based Large Language Models Tin Lai et al.
- Toolformer: Language Models Can Teach Themselves To Use Tools Timo Schick et al.
- Encouraging Divergent Thinking In Large Language Models Through Multi-agent Debate Tian Liang et al.
- Few-shot In-context Learning For Knowledge Base Question Answering Tianle Li et al.
- Cognitive Architectures For Language Agents Theodore R. Sumers, Shunyu Yao, Karthik Narasimhan, Thomas L. Griffiths
- Caption Anything: Interactive Image Description With Diverse Multimodal Controls Teng Wang et al.
- AI, Write An Essay For Me: A Large-scale Comparison Of Human-written Versus Chatgpt-generated Essays Steffen Herbold, Annette Hautli-janisz, Ute Heuer, Zlata Kikteva, Alexander Trautsch
- Chatgpt Perpetuates Gender Bias In Machine Translation And Ignores Non-gendered Pronouns: Findings Across Bengali And Five Other Low-resource Languages Sourojit Ghosh, Aylin Caliskan
- Pythia: A Suite For Analyzing Large Language Models Across Training And Scaling Stella Biderman et al.
- Metagpt: Meta Programming For A Multi-agent Collaborative Framework Sirui Hong et al.
- Thoughtsource: A Central Hub For Large Language Model Reasoning Data Simon Ott et al.
- Mind Meets Machine: Unravelling Gpt-4's Cognitive Psychology Sifatkaur Dhingra, Manmeet Singh, Vaisakh Sb, Neetiraj Malviya, Sukhpal Singh Gill
- Mitigating Object Hallucinations In Large Vision-language Models Through Visual Contrastive Decoding Sicong Leng et al.
- Do Generative Large Language Models Need Billions Of Parameters? Sia Gholami, Marwan Omar
- Tree Of Thoughts: Deliberate Problem Solving With Large Language Models Shunyu Yao et al.
- Opportunities And Challenges For Chatgpt And Large Language Models In Biomedicine And Health Shubo Tian et al.
- LIDA: A Tool For Automatic Generation Of Grammar-agnostic Visualizations And Infographics Using Large Language Models Victor Dibia
- Can Ai-generated Text Be Reliably Detected? Vinu Sankar Sadasivan, Aounon Kumar, Sriram Balasubramanian, Wenxiao Wang, Soheil Feizi
- Fully Autonomous Programming With Large Language Models Vadim Liventsev, Anastasiia Grishina, Aki Härmä, Leon Moonen
- Automated Reading Passage Generation With Openai's Large Language Model Ummugul Bezirhan, Matthias Von Davier
- Automating Human Tutor-style Programming Feedback: Leveraging GPT-4 Tutor Model For Hint Generation And GPT-3.5 Student Model For Hint Validation Tung Phung et al.
- Generative AI For Programming Education: Benchmarking Chatgpt, GPT-4, And Human Tutors Tung Phung et al.
- Creativity Support In The Age Of Large Language Models: An Empirical Study Involving Emerging Writers Tuhin Chakrabarty, Vishakh Padmakumar, Faeze Brahman, Smaranda Muresan
- Mm-vet: Evaluating Large Multimodal Models For Integrated Capabilities Weihao Yu et al.
- Promptcblue: A Chinese Prompt Tuning Benchmark For The Medical Domain Wei Zhu, Xiaoling Wang, Huanran Zheng, Mosha Chen, Buzhou Tang
- REPLUG: Retrieval-augmented Black-box Language Models Weijia Shi et al.
- Copiloting The Copilots: Fusing Large Language Models With Completion Engines For Automated Program Repair Yuxiang Wei, Chunqiu Steven Xia, Lingming Zhang
- Llmrec: Large Language Models With Graph Augmentation For Recommendation Wei Wei et al.
- Can Large Language Models Provide Useful Feedback On Research Papers? A Large-scale Empirical Analysis Weixin Liang et al.
- GPT Detectors Are Biased Against Non-native English Writers Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, James Zou
- Large Language Models As Zero-shot Conversational Recommenders Zhankui He et al.
- Chatgraph: Interpretable Text Classification By Converting Chatgpt Knowledge To Graphs Yucheng Shi et al.
- R2gengpt: Radiology Report Generation With Frozen Llms Zhanyu Wang, Lingqiao Liu, Lei Wang, Luping Zhou
- Chatkbqa: A Generate-then-retrieve Framework For Knowledge Base Question Answering With Fine-tuned Large Language Models Haoran Luo et al.
- Can Chatgpt Assess Human Personalities? A General Evaluation Framework Haocong Rao, Cyril Leung, Chunyan Miao
- Lmdrive: Closed-loop End-to-end Driving With Large Language Models Hao Shao et al.
- Safety Assessment Of Chinese Large Language Models Hao Sun, Zhexin Zhang, Jiawen Deng, Jiale Cheng, Minlie Huang
- Reasoning Implicit Sentiment With Chain-of-thought Prompting Hao Fei et al.
- CRITIC: Large Language Models Can Self-correct With Tool-interactive Critiquing Zhibin Gou et al.
- Personalisation Within Bounds: A Risk Taxonomy And Policy Framework For The Alignment Of Large Language Models With Personalised Feedback Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale
- Video-llama: An Instruction-tuned Audio-visual Language Model For Video Understanding Hang Zhang, Xin Li, Lidong Bing
- Llama Guard: Llm-based Input-output Safeguard For Human-ai Conversations Hakan Inan et al.
- GPT-4 Enhanced Multimodal Grounding For Autonomous Driving: Leveraging Cross-modal Attention With Large Language Models Haicheng Liao et al.
- Wizardmath: Empowering Mathematical Reasoning For Large Language Models Via Reinforced Evol-instruct Haipeng Luo et al.
- Applying Large Language Models And Chain-of-thought For Automatic Scoring Gyeong-geon Lee, Ehsan Latif, Xuansheng Wu, Ninghao Liu, Xiaoming Zhai
- Voyager: An Open-ended Embodied Agent With Large Language Models Guanzhi Wang et al.
- Chatgpt For Shaping The Future Of Dentistry: The Potential Of Multi-modal Large Language Model Hanyao Huang et al.
- Efficient Streaming Language Models With Attention Sinks Guangxuan Xiao, Yuandong Tian, Beidi Chen, Song Han, Mike Lewis
- Augmented Language Models: A Survey Grégoire Mialon et al.
- Level Generation Through Large Language Models Graham Todd, Sam Earle, Muhammad Umair Nasir, Michael Cerny Green, Julian Togelius
- Personality Traits In Large Language Models Greg Serapio-garcía et al.
- Batch Prompting: Efficient Inference With Large Language Model Apis Zhoujun Cheng, Jungo Kasai, Tao Yu
- Multimodal Chatgpt For Medical Applications: An Experimental Study Of GPT-4V Zhiling Yan et al.
- Repocoder: Repository-level Code Completion Through Iterative Retrieval And Generation Fengji Zhang et al.
- LLMR: Real-time Prompting Of Interactive Worlds Using Large Language Models Fernanda De La Torre et al.
- Exploring Human-like Translation Strategy With Large Language Models Zhiwei He et al.
- Chatgpt Outperforms Crowd-workers For Text-annotation Tasks Fabrizio Gilardi, Meysam Alizadeh, Maël Kubli
- Codegen2: Lessons For Training Llms On Programming And Natural Languages Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou
- Assigning AI: Seven Approaches For Students, With Prompts Ethan Mollick, Lilach Mollick
- Llm-adapters: An Adapter Family For Parameter-efficient Fine-tuning Of Large Language Models Zhiqiang Hu et al.
- Gptutor: A Chatgpt-powered Programming Tool For Code Explanation Eason Chen, Ray Huang, Han-shin Chen, Yuen-hsien Tseng, Liang-yi Li
- Vipergpt: Visual Inference Via Python Execution For Reasoning Dídac Surís, Sachit Menon, Carl Vondrick
- Llm-blender: Ensembling Large Language Models With Pairwise Ranking And Generative Fusion Dongfu Jiang, Xiang Ren, Bill Yuchen Lin
- MELTR: Meta Loss Transformer For Learning To Fine-tune Video Foundation Models Dohwan Ko et al.
- Using An LLM To Help With Code Understanding Daye Nam, Andrew Macvean, Vincent Hellendoorn, Bogdan Vasilescu, Brad Myers
- CORE-GPT: Combining Open Access Research And Large Language Models For Credible, Trustworthy Question Answering David Pride, Matteo Cancellieri, Petr Knoth
- Exploiting Programmatic Behavior Of Llms: Dual-use Through Standard Security Attacks Daniel Kang et al.
- Almanac: Retrieval-augmented Language Models For Clinical Medicine Cyril Zakka et al.
- AI And The FCI: Can Chatgpt Project An Understanding Of Introductory Physics? Colin G. West
- Llava-med: Training A Large Language-and-vision Assistant For Biomedicine In One Day Chunyuan Li et al.
- Multimodal Foundation Models: From Specialists To General-purpose Assistants Chunyuan Li et al.
- Drivelm: Driving With Graph Visual Question Answering Chonghao Sima et al.
- Opportunities And Risks Of Llms For Scalable Deliberation With Polis Christopher T. Small et al.
- Whitefox: White-box Compiler Fuzzing Empowered By Large Language Models Chenyuan Yang et al.
- Chateval: Towards Better Llm-based Evaluators Through Multi-agent Debate Chi-min Chan et al.
- Distilled GPT For Source Code Summarization Chia-yi Su, Collin Mcmillan
- Distilling Step-by-step! Outperforming Larger Language Models With Less Training Data And Smaller Model Sizes Cheng-yu Hsieh et al.
- An Iterative Optimizing Framework For Radiology Report Summarization With Chatgpt Chong Ma et al.
- Supporting Qualitative Analysis With Large Language Models: Combining Codebook With GPT-3 For Deductive Coding Ziang Xiao, Xingdi Yuan, Q. Vera Liao, Rania Abdelghani, Pierre-yves Oudeyer
- Chatdev: Communicative Agents For Software Development Chen Qian et al.
- Supporting Human-ai Collaboration In Auditing Llms With Llms Charvi Rastogi, Marco Tulio Ribeiro, Nicholas King, Harsha Nori, Saleema Amershi
- Blackvip: Black-box Visual Prompting For Robust Transfer Learning Changdae Oh et al.
- K2: A Foundation Language Model For Geoscience Knowledge Understanding And Utilization Cheng Deng et al.
- Llmseceval: A Dataset Of Natural Language Prompts For Security Evaluations Catherine Tony, Markus Mutas, Nicolás E. Díaz Ferreyra, Riccardo Scandariato
- Receive, Reason, And React: Drive As You Say With Large Language Models In Autonomous Vehicles Can Cui, Yunsheng Ma, Xu Cao, Wenqian Ye, Ziran Wang
- Prompting Or Fine-tuning? A Comparative Study Of Large Language Models For Taxonomy Construction Boqi Chen, Fandi Yi, Dániel Varró
- LLM+P: Empowering Large Language Models With Optimal Planning Proficiency Bo Liu et al.
- Evaluation Of Chatgpt For Nlp-based Mental Health Applications Bishal Lamichhane
- Swiftsage: A Generative Agent With Fast And Slow Thinking For Complex Interactive Tasks Bill Yuchen Lin et al.
- ART: Automatic Multi-step Reasoning And Tool-use For Large Language Models Bhargavi Paranjape et al.
- Can Large Language Models Transform Computational Social Science? Caleb Ziems et al.
- Facilitating Self-guided Mental Health Interventions Through Human-language Model Interaction: A Case Study Of Cognitive Restructuring Ashish Sharma, Kevin Rushton, Inna Wanyin Lin, Theresa Nguyen, Tim Althoff
- Expel: LLM Agents Are Experiential Learners Andrew Zhao et al.
- Chemcrow: Augmenting Large-language Models With Chemistry Tools Andres M Bran et al.
- Openassistant Conversations -- Democratizing Large Language Model Alignment Andreas Köpf et al.
- Openflamingo: An Open-source Framework For Training Large Autoregressive Vision-language Models Anas Awadalla et al.
- On Generative Agents In Recommendation An Zhang, Yuxin Chen, Leheng Sheng, Xiang Wang, Tat-seng Chua
- Robots That Ask For Help: Uncertainty Alignment For Large Language Model Planners Allen Z. Ren et al.
- Lamp: When Large Language Models Meet Personalization Alireza Salemi, Sheshera Mysore, Michael Bendersky, Hamed Zamani
- Can Chatgpt Forecast Stock Price Movements? Return Predictability And Large Language Models Alejandro Lopez-lira, Yuehua Tang
- Chatgpt: More Than A Weapon Of Mass Deception, Ethical Challenges And Responses From The Human-centered Artificial Intelligence (HCAI) Perspective Alejo Jose G. Sison, Marco Tulio Daza, Roberto Gozalo-brizuela, Eduardo C. Garrido-merchán
- Clipsyntel: CLIP And LLM Synergy For Multimodal Question Summarization In Healthcare Akash Ghosh et al.
- Self-rag: Learning To Retrieve, Generate, And Critique Through Self-reflection Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, Hannaneh Hajishirzi
- Can Chatgpt And Bard Generate Aligned Assessment Items? A Reliability Analysis Against Human Performance Abdolvahab Khademi
- Conversational Ai-powered Design: Chatgpt As Designer, User, And Product A. Baki Kocaballi
- Toolllm: Facilitating Large Language Models To Master 16000+ Real-world Apis Yujia Qin et al.
- Beyond Chain-of-thought, Effective Graph-of-thought Reasoning In Language Models Yao Yao, Zuchao Li, Hai Zhao
- Chatpose: Chatting About 3D Human Pose Yao Feng et al.
- Better To Ask In English: Cross-lingual Evaluation Of Large Language Models For Healthcare Queries Yiqiao Jin et al.
- Recommender Systems In The Era Of Large Language Models (llms) Zihuai Zhao et al.
- Enhancing Job Recommendation Through Llm-based Generative Adversarial Networks Yingpeng Du et al.
- Improving Factuality And Reasoning In Language Models Through Multiagent Debate Yilun Du, Shuang Li, Antonio Torralba, Joshua B. Tenenbaum, Igor Mordatch
- Can Chatgpt Replace Traditional KBQA Models? An In-depth Analysis Of The Question Answering Performance Of The GPT LLM Family Yiming Tan et al.
- INSTRUCTEVAL: Towards Holistic Evaluation Of Instruction-tuned Large Language Models Yew Ken Chia, Pengfei Hong, Lidong Bing, Soujanya Poria
- A Multitask, Multilingual, Multimodal Evaluation Of Chatgpt On Reasoning, Hallucination, And Interactivity Yejin Bang et al.
- Mindmap: Knowledge Graph Prompting Sparks Graph Of Thoughts In Large Language Models Yilin Wen, Zifeng Wang, Jimeng Sun
- Gpt4aigchip: Towards Next-generation AI Accelerator Design Automation Via Large Language Models Yonggan Fu et al.
- NL2TL: Transforming Natural Languages To Temporal Logics Using Large Language Models Yongchao Chen, Rujul Gandhi, Yang Zhang, Chuchu Fan
- How Far Can Camels Go? Exploring The State Of Instruction Tuning On Open Resources Yizhong Wang et al.
- Fundamental Limitations Of Alignment In Large Language Models Yotam Wolf, Noam Wies, Oshri Avnery, Yoav Levine, Amnon Shashua
- The Dark Side Of Chatgpt: Legal And Ethical Challenges From Stochastic Parrots And Hallucination Zihao Li
- Llavar: Enhanced Visual Instruction Tuning For Text-rich Image Understanding Yanzhe Zhang et al.
- Alpacafarm: A Simulation Framework For Methods That Learn From Human Feedback Yann Dubois et al.
- Llama-vid: An Image Is Worth 2 Tokens In Large Language Models Yanwei Li, Chengyao Wang, Jiaya Jia
- G-eval: NLG Evaluation Using GPT-4 With Better Human Alignment Yang Liu et al.
- Recmind: Large Language Model Powered Agent For Recommendation Yancheng Wang et al.
- Improving Large Language Models For Clinical Named Entity Recognition Via Prompt Engineering Yan Hu et al.
- Chat With The Environment: Interactive Multimodal Perception Using Large Language Models Xufeng Zhao, Mengdi Li, Cornelius Weber, Muhammad Burhan Hafez, Stefan Wermter
- Integrating Action Knowledge And Llms For Task Planning And Situation Handling In Open Worlds Yan Ding et al.
- Can Chatgpt Pass The Vietnamese National High School Graduation Examination? Xuan-quy Dao, Ngoc-bich Le, Xuan-dung Phan, Bac-bien Ngo
- Performance Comparison Of Large Language Models On VNHSGE English Dataset: Openai Chatgpt, Microsoft Bing Chat, And Google Bard Xuan-quy Dao
- Ghost In The Minecraft: Generally Capable Agents For Open-world Environments Via Large Language Models With Text-based Knowledge And Memory Xizhou Zhu et al.
- "do Anything Now": Characterizing And Evaluating In-the-wild Jailbreak Prompts On Large Language Models Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, Yang Zhang
- Mitigating Large Language Model Hallucinations Via Autonomous Knowledge Graph-based Retrofitting Xinyan Guan et al.
- Query Rewriting For Retrieval-augmented Large Language Models Xinbei Ma, Yeyun Gong, Pengcheng He, Hai Zhao, Nan Duan
- Rethinking The Evaluation For Conversational Recommendation In The Era Of Large Language Models Xiaolei Wang, Xinyu Tang, Wayne Xin Zhao, Jingyuan Wang, Ji-rong Wen
- Xuanyuan 2.0: A Large Chinese Financial Chat Model With Hundreds Of Billions Parameters Xuanyu Zhang, Qing Yang, Dongliang Xu
- Deceptive AI Ecosystems: The Case Of Chatgpt Xiao Zhan, Yifan Xu, Stefan Sarkadi
- Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! Xiangyu Qi et al.
- Medagents: Large Language Models As Collaborators For Zero-shot Medical Reasoning Xiangru Tang et al.
- Representation Learning With Large Language Models For Recommendation Xubin Ren et al.
- Large Language Models In Education: Vision And Opportunities Wensheng Gan, Zhenlian Qi, Jiayang Wu, Jerry Chun-wei Lin
- LIV: Language-image Representations And Rewards For Robotic Control Yecheng Jason Ma et al.
- Minillm: Knowledge Distillation Of Large Language Models Yuxian Gu, Li Dong, Furu Wei, Minlie Huang
- Chatdoctor: A Medical Chat Model Fine-tuned On A Large Language Model Meta-ai (llama) Using Medical Domain Knowledge Yunxiang Li et al.
- Lampilot: An Open Benchmark Dataset For Autonomous Driving With Language Model Programs Yunsheng Ma et al.
- Towards Open-world Recommendation With Knowledge Augmentation From Large Language Models Yunjia Xi et al.
- Chat-rec: Towards Interactive And Explainable Llms-augmented Recommender System Yunfan Gao et al.
- Tool Learning With Foundation Models Yujia Qin et al.
- Character-llm: A Trainable Agent For Role-playing Yunfan Shao, Linyang Li, Junqi Dai, Xipeng Qiu
- Contextual Object Detection With Multimodal Large Language Models Yuhang Zang, Wei Li, Jun Han, Kaiyang Zhou, Chen Change Loy
- Educhat: A Large-scale Language Model-based Chatbot System For Intelligent Education Yuhao Dan et al.
- Toolqa: A Dataset For LLM Question Answering With External Tools Yuchen Zhuang, Yue Yu, Kuan Wang, Haotian Sun, Chao Zhang
- Guiding Large Language Models Via Directional Stimulus Prompting Zekun Li et al.
- Fine-grained Human Feedback Gives Better Rewards For Language Model Training Zeqiu Wu et al.
- C-eval: A Multi-level Multi-discipline Chinese Evaluation Suite For Foundation Models Yuzhen Huang et al.
- Let The Llms Talk: Simulating Human-to-human Conversational QA Via Zero-shot Llm-to-llm Interactions Zahra Abbasiantaeb, Yifei Yuan, Evangelos Kanoulas, Mohammad Aliannejadi
- Hard Prompts Made Easy: Gradient-based Discrete Optimization For Prompt Tuning And Discovery Yuxin Wen et al.
- Chatbot Arena: An Open Platform For Evaluating Llms By Human Preference Wei-lin Chiang et al.
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- Transformers Are Ssms: Generalized Models And Efficient Algorithms Through Structured State Space Duality Tri Dao, Albert Gu
- Continual Learning For Large Language Models: A Survey Tongtong Wu et al.
- Chatglm: A Family Of Large Language Models From GLM-130B To GLM-4 All Tools Team GLM et al.
- Towards Conversational Diagnostic AI Tao Tu et al.
- Chatgpt As Research Scientist: Probing Gpt's Capabilities As A Research Librarian, Research Ethicist, Data Generator And Data Predictor Steven A. Lehr, Aylin Caliskan, Suneragiri Liyanage, Mahzarin R. Banaji
- Large Language Models Meet Collaborative Filtering: An Efficient All-round Llm-based Recommender System Sein Kim et al.
- An Empirical Study On Usage And Perceptions Of Llms In A Software Engineering Project Sanka Rasnayaka, Guanlin Wang, Ridwan Shariffdeen, Ganesh Neelakanta Iyer
- Beyond Code Generation: An Observational Study Of Chatgpt Usage In Software Engineering Practice Ranim Khojah, Mazen Mohamad, Philipp Leitner, Francisco Gomes De Oliveira Neto
- A Systematic Survey Of Prompt Engineering In Large Language Models: Techniques And Applications Pranab Sahoo et al.
- SNIFFER: Multimodal Large Language Model For Explainable Out-of-context Misinformation Detection Peng Qi, Zehong Yan, Wynne Hsu, Mong Li Lee
- Shaping Human-ai Collaboration: Varied Scaffolding Levels In Co-writing With Language Models Paramveer S. Dhillon et al.
- Ai-augmented Brainwriting: Investigating The Use Of Llms In Group Ideation Orit Shaer, Angelora Cooper, Osnat Mokryn, Andrew L. Kun, Hagit Ben Shoshan
- Iris: An Ai-driven Virtual Tutor For Computer Science Education Patrick Bassner, Eduard Frankford, Stephan Krusche
- Same Task, More Tokens: The Impact Of Input Length On The Reasoning Performance Of Large Language Models Mosh Levy, Alon Jacoby, Yoav Goldberg
- A Review Of Large Language Models And Autonomous Agents In Chemistry Mayk Caldas Ramos, Christopher J. Collison, Andrew D. White
- Large Legal Fictions: Profiling Legal Hallucinations In Large Language Models Matthew Dahl, Varun Magesh, Mirac Suzgun, Daniel E. Ho
- A Piece Of Theatre: Investigating How Teachers Design LLM Chatbots To Assist Adolescent Cyberbullying Education Michael A. Hedderich et al.
- Codeaid: Evaluating A Classroom Deployment Of An Llm-based Programming Assistant That Balances Student And Educator Needs Majeed Kazemitabaar et al.
- Data Is All You Need: Finetuning Llms For Chip Design Via An Automated Design-data Augmentation Framework Kaiyan Chang et al.
- Pixart-Σ: Weak-to-strong Training Of Diffusion Transformer For 4K Text-to-image Generation Junsong Chen et al.
- Openmedlm: Prompt Engineering Can Out-perform Fine-tuning In Medical Question-answering With Open-source Large Language Models Jenish Maharjan et al.
- (A)I Am Not A Lawyer, But...: Engaging Legal Experts Towards Responsible LLM Policies For Legal Advice Inyoung Cheong, King Xia, K. J. Kevin Feng, Quan Ze Chen, Amy X. Zhang
- Materials Science In The Era Of Large Language Models: A Perspective Ge Lei, Ronan Docherty, Samuel J. Cooper
- Code-aware Prompting: A Study Of Coverage Guided Test Generation In Regression Setting Using LLM Gabriel Ryan et al.
- Ai-tutoring In Software Engineering Education Eduard Frankford, Clemens Sauerwein, Patrick Bassner, Stephan Krusche, Ruth Breu
- Chemllm: A Chemical Large Language Model Di Zhang et al.
- Deepseek-coder: When The Large Language Model Meets Programming -- The Rise Of Code Intelligence Daya Guo et al.
- Generative AI In EU Law: Liability, Privacy, Intellectual Property, And Cybersecurity Claudio Novelli, Federico Casolari, Philipp Hacker, Giorgio Spedicato, Luciano Floridi
- Homogenization Effects Of Large Language Models On Human Creative Ideation Barrett R. Anderson, Jash Hemant Shah, Max Kreminski
- Taking The Next Step With Generative Artificial Intelligence: The Transformative Role Of Multimodal Large Language Models In Science Education Arne Bewersdorff et al.
- Generative AI In Education: A Study Of Educators' Awareness, Sentiments, And Influencing Factors Aashish Ghimire, James Prather, John Edwards
- Yi: Open Foundation Models By 01.AI 01.AI et al.
- Harnessing Large Language Models For Text-rich Sequential Recommendation Zhi Zheng, Wenshuo Chao, Zhaopeng Qiu, Hengshu Zhu, Hui Xiong
- Quality Of Answers Of Generative Large Language Models Vs Peer Patients For Interpreting Lab Test Results For Lay Patients: Evaluation Study Zhe He et al.
- Let Me Do It For You: Towards LLM Empowered Recommendation Via Tool Learning Yuyue Zhao et al.
- A Survey On Lora Of Large Language Models Yuren Mao et al.
- Survey On Large Language Model-enhanced Reinforcement Learning: Concept, Taxonomy, And Methods Yuji Cao et al.
- Large Language Models In Mental Health Care: A Scoping Review Yining Hua et al.
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
- Llamafactory: Unified Efficient Fine-tuning Of 100+ Language Models Yaowei Zheng et al.
- Data-efficient Fine-tuning For Llm-based Recommendation Xinyu Lin et al.
- Prompting Large Language Models With Rationale Heuristics For Knowledge-based Visual Question Answering Zhongjian Hu, Peng Yang, Bing Li, Fengyuan Liu
- Measurement Of Llm's Philosophies Of Human Nature Minheng Ni et al.
🏷 Training Techniques
- Programming With A Differentiable Forth Interpreter Matko Bošnjak, Tim Rocktäschel, Jason Naradowsky, Sebastian Riedel
- Sequence-to-sequence Learning As Beam-search Optimization Sam Wiseman, Alexander M. Rush
- An Actor-critic Algorithm For Sequence Prediction Dzmitry Bahdanau et al.
- Steering Output Style And Topic In Neural Response Generation Di Wang, Nebojsa Jojic, Chris Brockett, Eric Nyberg
- Attention Is All You Need Ashish Vaswani et al.
- Mojitalk: Generating Emotional Responses At Scale Xianda Zhou, William Yang Wang
- Weighted Transformer Network For Machine Translation Karim Ahmed, Nitish Shirish Keskar, Richard Socher
- Long Text Generation Via Adversarial Training With Leaked Information Jiaxian Guo et al.
- Adversarial Learning For Neural Dialogue Generation Jiwei Li et al.
- Data Distillation For Controlling Specificity In Dialogue Generation Jiwei Li, Will Monroe, Dan Jurafsky
- A Unified Query-based Generative Model For Question Generation And Question Answering Linfeng Song, Zhiguo Wang, Wael Hamza
- Neural Response Generation With Dynamic Vocabularies Yu Wu et al.
- Parlai: A Dialog Research Software Platform Alexander H. Miller et al.
- Fine Grained Knowledge Transfer For Personalized Task-oriented Dialogue Systems Kaixiang Mo, Yu Zhang, Qiang Yang, Pascale Fung
- Maskgan: Better Text Generation Via Filling In The______ William Fedus, Ian Goodfellow, Andrew M. Dai
- Dialogue Generation: From Imitation Learning To Inverse Reinforcement Learning Ziming Li, Julia Kiseleva, Maarten De Rijke
- Fast Abstractive Summarization With Reinforce-selected Sentence Rewriting Yen-chun Chen, Mohit Bansal
- Efficient Contextualized Representation: Language Model Pruning For Sequence Labeling Liyuan Liu, Xiang Ren, Jingbo Shang, Jian Peng, Jiawei Han
- Multilingual Constituency Parsing With Self-attention And Pre-training Nikita Kitaev, Steven Cao, Dan Klein
- Improving Machine Reading Comprehension With General Reading Strategies Kai Sun, Dian Yu, Dong Yu, Claire Cardie
- Extending Neural Generative Conversational Model Using External Knowledge Sources Prasanna Parthasarathi, Joelle Pineau
- Adversarially Regularising Neural NLI Models To Integrate Logical Background Knowledge Pasquale Minervini, Sebastian Riedel
- Learn To Code-switch: Data Augmentation Using Copy Mechanism On Language Modeling Genta Indra Winata, Andrea Madotto, Chien-sheng Wu, Pascale Fung
- Adversarial Over-sensitivity And Over-stability Strategies For Dialogue Models Tong Niu, Mohit Bansal
- Can You Tell Me How To Get Past Sesame Street? Sentence-level Pretraining Beyond Language Modeling Alex Wang et al.
- A Retrieve-and-edit Framework For Predicting Structured Outputs Tatsunori B. Hashimoto, Kelvin Guu, Yonatan Oren, Percy Liang
- Skeleton-to-response: Dialogue Generation Guided By Retrieval Memory Deng Cai et al.
- Language Gans Falling Short Massimo Caccia et al.
- Sentence Encoders On Stilts: Supplementary Training On Intermediate Labeled-data Tasks Jason Phang, Thibault Févry, Samuel R. Bowman
- Toward Diverse Text Generation With Inverse Reinforcement Learning Zhan Shi, Xinchi Chen, Xipeng Qiu, Xuanjing Huang
- Improving The Transformer Translation Model With Document-level Context Jiacheng Zhang et al.
- Retrieval-enhanced Adversarial Training For Neural Response Generation Qingfu Zhu, Lei Cui, Weinan Zhang, Furu Wei, Ting Liu
- Training Tips For The Transformer Model Martin Popel, Ondřej Bojar
- Training Millions Of Personalized Dialogue Agents Pierre-emmanuel Mazaré, Samuel Humeau, Martin Raison, Antoine Bordes
- Emrqa: A Large Corpus For Question Answering On Electronic Medical Records Anusri Pampari, Preethi Raghavan, Jennifer Liang, Jian Peng
- Zero-shot Adaptive Transfer For Conversational Language Understanding Sungjin Lee, Rahul Jha
- Generating Informative And Diverse Conversational Responses Via Adversarial Information Maximization Yizhe Zhang et al.
- Another Diversity-promoting Objective Function For Neural Dialogue Generation Ryo Nakamura, Katsuhito Sudoh, Koichiro Yoshino, Satoshi Nakamura
- Attention-guided Answer Distillation For Machine Reading Comprehension Minghao Hu et al.
- Towards Empathetic Open-domain Conversation Models: A New Benchmark And Dataset Hannah Rashkin, Eric Michael Smith, Margaret Li, Y-lan Boureau
- Simple Fusion: Return Of The Language Model Felix Stahlberg, James Cross, Veselin Stoyanov
- BERT: Pre-training Of Deep Bidirectional Transformers For Language Understanding Jacob Devlin, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Polite Dialogue Generation Without Parallel Data Tong Niu, Mohit Bansal
- Multi-passage BERT: A Globally Normalized BERT Model For Open-domain Question Answering Zhiguo Wang, Patrick Ng, Xiaofei Ma, Ramesh Nallapati, Bing Xiang
- Unified Vision-language Pre-training For Image Captioning And VQA Luowei Zhou et al.
- Ensemble-based Deep Reinforcement Learning For Chatbots Heriberto Cuayáhuitl et al.
- Sample Efficient Text Summarization Using A Single Pre-trained Transformer Urvashi Khandelwal, Kevin Clark, Dan Jurafsky, Lukasz Kaiser
- Efficient Adaptation Of Pretrained Transformers For Abstractive Summarization Andrew Hoang, Antoine Bosselut, Asli Celikyilmaz, Yejin Choi
- BART: Denoising Sequence-to-sequence Pre-training For Natural Language Generation, Translation, And Comprehension Mike Lewis et al.
- Cross-lingual Language Model Pretraining Guillaume Lample, Alexis Conneau
- Structbert: Incorporating Language Structures Into Pre-training For Deep Language Understanding Wei Wang et al.
- Probing Natural Language Inference Models Through Semantic Fragments Kyle Richardson, Hai Hu, Lawrence S. Moss, Ashish Sabharwal
- Bert4rec: Sequential Recommendation With Bidirectional Encoder Representations From Transformer Fei Sun et al.
- Entity-consistent End-to-end Task-oriented Dialogue System With KB Retriever Libo Qin et al.
- Acquiring Knowledge From Pre-trained Model To Neural Machine Translation Rongxiang Weng, Heng Yu, Shujian Huang, Shanbo Cheng, Weihua Luo
- Align, Mask And Select: A Simple Method For Incorporating Commonsense Knowledge Into Language Representation Models Zhi-xiu Ye, Qian Chen, Wen Wang, Zhen-hua Ling
- Olmpics -- On What Language Model Pre-training Captures Alon Talmor, Yanai Elazar, Yoav Goldberg, Jonathan Berant
- Multiqa: An Empirical Investigation Of Generalization And Transfer In Reading Comprehension Alon Talmor, Jonathan Berant
- Nemo: A Toolkit For Building AI Applications Using Neural Modules Oleksii Kuchaiev et al.
- Convert: Efficient And Accurate Conversational Representations From Transformers Matthew Henderson et al.
- Reducing Transformer Depth On Demand With Structured Dropout Angela Fan, Edouard Grave, Armand Joulin
- Mixture Content Selection For Diverse Sequence Generation Jaemin Cho, Minjoon Seo, Hannaneh Hajishirzi
- Q8BERT: Quantized 8bit BERT Ofir Zafrir, Guy Boudoukh, Peter Izsak, Moshe Wasserblat
- Improving Transformer Models By Reordering Their Sublayers Ofir Press, Noah A. Smith, Omer Levy
- The Curious Case Of Neural Text Degeneration Ari Holtzman, Jan Buys, Li Du, Maxwell Forbes, Yejin Choi
- Domain Adaptive Dialog Generation Via Meta Learning Kun Qian, Zhou Yu
- Pretrained Language Models For Document-level Neural Machine Translation Liangyou Li, Xin Jiang, Qun Liu
- Cross-lingual Natural Language Generation Via Pre-training Zewen Chi et al.
- PEGASUS: Pre-training With Extracted Gap-sentences For Abstractive Summarization Jingqing Zhang, Yao Zhao, Mohammad Saleh, Peter J. Liu
- Non-monotonic Sequential Text Generation Sean Welleck, Kianté Brantley, Hal Iii Daumé, Kyunghyun Cho
- LAMOL: Language Modeling For Lifelong Language Learning Fan-keng Sun, Cheng-hao Ho, Hung-yi Lee
- Good-enough Compositional Data Augmentation Jacob Andreas
- Language Models As Knowledge Bases? Fabio Petroni et al.
- BERT For Joint Intent Classification And Slot Filling Qian Chen, Zhu Zhuo, Wen Wang
- PLATO: Pre-trained Dialogue Generation Model With Discrete Latent Variable Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang
- Sticking To The Facts: Confident Decoding For Faithful Data-to-text Generation Ran Tian, Shashi Narayan, Thibault Sellam, Ankur P. Parikh
- Transformers Without Tears: Improving The Normalization Of Self-attention Toan Q. Nguyen, Julian Salazar
- Distilbert, A Distilled Version Of BERT: Smaller, Faster, Cheaper And Lighter Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
- Multifit: Efficient Multi-lingual Language Model Fine-tuning Julian Martin Eisenschlos et al.
- Camembert: A Tasty French Language Model Louis Martin et al.
- Encode, Tag, Realize: High-precision Text Editing Eric Malmi, Sebastian Krause, Sascha Rothe, Daniil Mirylenka, Aliaksei Severyn
- Unicoder: A Universal Language Encoder By Pre-training With Multiple Cross-lingual Tasks Haoyang Huang et al.
- LXMERT: Learning Cross-modality Encoder Representations From Transformers Hao Tan, Mohit Bansal
- Well-read Students Learn Better: On The Importance Of Pre-training Compact Models Iulia Turc, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Unified Language Model Pre-training For Natural Language Understanding And Generation Li Dong et al.
- UER: An Open-source Toolkit For Pre-training Models Zhe Zhao et al.
- Consistent Dialogue Generation With Self-supervised Feature Learning Yizhe Zhang et al.
- Generalization In Generation: A Closer Look At Exposure Bias Florian Schmidt
- Unsupervised Cross-lingual Representation Learning At Scale Alexis Conneau et al.
- Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection Guangxiang Zhao et al.
- A Pre-training Based Personalized Dialogue Generation Model With Persona-sparse Data Yinhe Zheng, Rongsheng Zhang, Xiaoxi Mao, Minlie Huang
- Reweighted Proximal Pruning For Large-scale Language Representation Fu-ming Guo, Sijia Liu, Finlay S. Mungall, Xue Lin, Yanzhi Wang
- Training Neural Response Selection For Task-oriented Dialogue Systems Matthew Henderson et al.
- Cloze-driven Pretraining Of Self-attention Networks Alexei Baevski, Sergey Edunov, Yinhan Liu, Luke Zettlemoyer, Michael Auli
- Are Sixteen Heads Really Better Than One? Paul Michel, Omer Levy, Graham Neubig
- Structured Pruning Of A Bert-based Question Answering Model J. S. Mccarley, Rishav Chakravarti, Avirup Sil
- Tinybert: Distilling BERT For Natural Language Understanding Xiaoqi Jiao et al.
- Zero: Memory Optimizations Toward Training Trillion Parameter Models Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, Yuxiong He
- Review Conversational Reading Comprehension Hu Xu, Bing Liu, Lei Shu, Philip S. Yu
- Transfertransfo: A Transfer Learning Approach For Neural Network Based Conversational Agents Thomas Wolf, Victor Sanh, Julien Chaumond, Clement Delangue
- An Effective Domain Adaptive Post-training Method For BERT In Response Selection Taesun Whang et al.
- How Does BERT Answer Questions? A Layer-wise Analysis Of Transformer Representations Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- End-to-end Bias Mitigation By Modelling Biases In Corpora Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- What Would Elsa Do? Freezing Layers During Transformer Fine-tuning Jaejun Lee, Raphael Tang, Jimmy Lin
- Adding Interpretable Attention To Neural Translation Models Improves Word Alignment Thomas Zenkel, Joern Wuebker, John Denero
- VL-BERT: Pre-training Of Generic Visual-linguistic Representations Weijie Su et al.
- Winogrande: An Adversarial Winograd Schema Challenge At Scale Keisuke Sakaguchi, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi
- Story Ending Prediction By Transferable BERT Zhongyang Li, Xiao Ding, Ting Liu
- Codegru: Context-aware Deep Learning With Gated Recurrent Unit For Source Code Modeling Yasir Hussain, Zhiqiu Huang, Yu Zhou, Senzhang Wang
- Learning From Explanations With Neural Execution Tree Ziqi Wang et al.
- Learning To Deceive With Attention-based Explanations Danish Pruthi, Mansi Gupta, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton
- MASS: Masked Sequence To Sequence Pre-training For Language Generation Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-yan Liu
- Learning From Dialogue After Deployment: Feed Yourself, Chatbot! Braden Hancock, Antoine Bordes, Pierre-emmanuel Mazaré, Jason Weston
- A Simple But Effective Method To Incorporate Multi-turn Context With BERT For Conversational Machine Comprehension Yasuhito Ohsugi, Itsumi Saito, Kyosuke Nishida, Hisako Asano, Junji Tomita
- Scheduled Sampling For Transformers Tsvetomila Mihaylova, André F. T. Martins
- Levenshtein Transformer Jiatao Gu, Changhan Wang, Jake Zhao
- On The Use Of BERT For Neural Machine Translation Stéphane Clinchant, Kweon Woo Jung, Vassilina Nikoulina
- QASC: A Dataset For Question Answering Via Sentence Composition Tushar Khot, Peter Clark, Michal Guerquin, Peter Jansen, Ashish Sabharwal
- Build It Break It Fix It For Dialogue Safety: Robustness From Adversarial Human Attack Emily Dinan, Samuel Humeau, Bharath Chintagunta, Jason Weston
- Enabling Robots To Understand Incomplete Natural Language Instructions Using Commonsense Reasoning Haonan Chen, Hao Tan, Alan Kuntz, Mohit Bansal, Ron Alterovitz
- Leveraging Pre-trained Checkpoints For Sequence Generation Tasks Sascha Rothe, Shashi Narayan, Aliaksei Severyn
- Multi-hop Question Answering Via Reasoning Chains Jifan Chen, Shih-ting Lin, Greg Durrett
- CTRL: A Conditional Transformer Language Model For Controllable Generation Nitish Shirish Keskar, Bryan Mccann, Lav R. Varshney, Caiming Xiong, Richard Socher
- Paraphrasing With Large Language Models Sam Witteveen, Martin Andrews
- Semantics-aware BERT For Language Understanding Zhuosheng Zhang et al.
- Data Augmentation For BERT Fine-tuning In Open-domain Question Answering Wei Yang et al.
- Patent Claim Generation By Fine-tuning Openai GPT-2 Jieh-sheng Lee, Jieh Hsiang
- Retrieve, Read, Rerank: Towards End-to-end Multi-document Reading Comprehension Minghao Hu, Yuxing Peng, Zhen Huang, Dongsheng Li
- Learning And Evaluating General Linguistic Intelligence Dani Yogatama et al.
- Dialogpt: Large-scale Generative Pre-training For Conversational Response Generation Yizhe Zhang et al.
- 12-in-1: Multi-task Vision And Language Representation Learning Jiasen Lu, Vedanuj Goswami, Marcus Rohrbach, Devi Parikh, Stefan Lee
- Parallel Scheduled Sampling Daniel Duckworth, Arvind Neelakantan, Ben Goodrich, Lukasz Kaiser, Samy Bengio
- Inducing Brain-relevant Bias In Natural Language Processing Models Dan Schwartz, Mariya Toneva, Leila Wehbe
- Parameter-efficient Transfer Learning For NLP Neil Houlsby et al.
- Automatic Spanish Translation Of The Squad Dataset For Multilingual Question Answering Casimiro Pio Carrino, Marta R. Costa-jussà, José A. R. Fonollosa
- Visualizing And Understanding The Effectiveness Of BERT Yaru Hao, Li Dong, Furu Wei, Ke Xu
- Do Attention Heads In BERT Track Syntactic Dependencies? Phu Mon Htut, Jason Phang, Shikha Bordia, Samuel R. Bowman
- Tree Transformer: Integrating Tree Structures Into Self-attention Yau-shian Wang, Hung-yi Lee, Yun-nung Chen
- Structured Pruning Of Large Language Models Ziheng Wang, Jeremy Wohlwend, Tao Lei
- Freelb: Enhanced Adversarial Training For Natural Language Understanding Chen Zhu et al.
- Fine-tuning Language Models From Human Preferences Daniel M. Ziegler et al.
- Evaluating Commonsense In Pre-trained Language Models Xuhui Zhou, Yue Zhang, Leyang Cui, Dandan Huang
- Model Compression With Two-stage Multi-teacher Knowledge Distillation For Web Question Answering System Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang
- Barack's Wife Hillary: Using Knowledge-graphs For Fact-aware Language Modeling Robert L. Iv Logan, Nelson F. Liu, Matthew E. Peters, Matt Gardner, Sameer Singh
- Learning And Evaluating Contextual Embedding Of Source Code Aditya Kanade, Petros Maniatis, Gogul Balakrishnan, Kensen Shi
- Unsupervised Question Answering By Cloze Translation Patrick Lewis, Ludovic Denoyer, Sebastian Riedel
- Linguistic Knowledge And Transferability Of Contextual Representations Nelson F. Liu, Matt Gardner, Yonatan Belinkov, Matthew E. Peters, Noah A. Smith
- Unicoder-vl: A Universal Encoder For Vision And Language By Cross-modal Pre-training Gen Li et al.
- Plug And Play Language Models: A Simple Approach To Controlled Text Generation Sumanth Dathathri et al.
- Hello, It's GPT-2 -- How Can I Help You? Towards The Use Of Pretrained Language Models For Task-oriented Dialogue Systems Paweł Budzianowski, Ivan Vulić
- MLQA: Evaluating Cross-lingual Extractive Question Answering Patrick Lewis, Barlas Oğuz, Ruty Rinott, Sebastian Riedel, Holger Schwenk
- What Makes A Good Conversation? How Controllable Attributes Affect Human Judgments Abigail See, Stephen Roller, Douwe Kiela, Jason Weston
- Blockwise Self-attention For Long Document Understanding Jiezhong Qiu et al.
- Span Selection Pre-training For Question Answering Michael Glass et al.
- Adapting And Evaluating A Deep Learning Language Model For Clinical Why-question Answering Andrew Wen, Mohamed Y. Elwazir, Sungrim Moon, Jungwei Fan
- Roberta: A Robustly Optimized BERT Pretraining Approach Yinhan Liu et al.
- Fast Transformer Decoding: One Write-head Is All You Need Noam Shazeer
- GLTR: Statistical Detection And Visualization Of Generated Text Sebastian Gehrmann, Hendrik Strobelt, Alexander M. Rush
- Learning To Select Knowledge For Response Generation In Dialog Systems Rongzhong Lian, Min Xie, Fan Wang, Jinhua Peng, Hua Wu
- Juice: A Large Scale Distantly Supervised Dataset For Open Domain Context-based Code Generation Rajas Agashe, Srinivasan Iyer, Luke Zettlemoyer
- Insertion Transformer: Flexible Sequence Generation Via Insertion Operations Mitchell Stern, William Chan, Jamie Kiros, Jakob Uszkoreit
- Gmail Smart Compose: Real-time Assisted Writing Mia Xu Chen et al.
- Explain Yourself! Leveraging Language Models For Commonsense Reasoning Nazneen Fatema Rajani, Bryan Mccann, Caiming Xiong, Richard Socher
- Lakhnes: Improving Multi-instrumental Music Generation With Cross-domain Pre-training Chris Donahue, Huanru Henry Mao, Yiting Ethan Li, Garrison W. Cottrell, Julian Mcauley
- Synthetic QA Corpora Generation With Roundtrip Consistency Chris Alberti, Daniel Andor, Emily Pitler, Jacob Devlin, Michael Collins
- Microsoft Translator At WMT 2019: Towards Large-scale Document-level Neural Machine Translation Marcin Junczys-dowmunt
- Megatron-lm: Training Multi-billion Parameter Language Models Using Model Parallelism Mohammad Shoeybi et al.
- ALBERT: A Lite BERT For Self-supervised Learning Of Language Representations Zhenzhong Lan et al.
- Don't Say That! Making Inconsistent Dialogue Unlikely With Unlikelihood Training Margaret Li et al.
- Countering Language Drift Via Visual Grounding Jason Lee, Kyunghyun Cho, Douwe Kiela
- Learning To Few-shot Learn Across Diverse Natural Language Classification Tasks Trapit Bansal, Rishikesh Jha, Andrew Mccallum
- What Does BERT Learn From Multiple-choice Reading Comprehension Datasets? Chenglei Si, Shuohang Wang, Min-yen Kan, Jing Jiang
- Syntax-infused Transformer And BERT Models For Machine Translation And Natural Language Understanding Dhanasekar Sundararaman et al.
- Transfer Fine-tuning: A BERT Case Study Yuki Arase, Junichi Tsujii
- Towards Transfer Learning For End-to-end Speech Synthesis From Deep Pre-trained Language Models Wei Fang, Yu-an Chung, James Glass
- Pretrained Encyclopedia: Weakly Supervised Knowledge-pretrained Language Model Wenhan Xiong, Jingfei Du, William Yang Wang, Veselin Stoyanov
- Few-shot NLG With Pre-trained Language Model Zhiyu Chen, Harini Eavani, Wenhu Chen, Yinyin Liu, William Yang Wang
- Text Summarization With Pretrained Encoders Yang Liu, Mirella Lapata
- Towards Making The Most Of BERT In Neural Machine Translation Jiacheng Yang et al.
- Fairseq: A Fast, Extensible Toolkit For Sequence Modeling Myle Ott et al.
- TANDA: Transfer And Adapt Pre-trained Transformer Models For Answer Sentence Selection Siddhant Garg, Thuy Vu, Alessandro Moschitti
- Exploring The Limits Of Transfer Learning With A Unified Text-to-text Transformer Colin Raffel et al.
- Robust Navigation With Language Pretraining And Stochastic Sampling Xiujun Li et al.
- Learning To Answer By Learning To Ask: Getting The Best Of GPT-2 And BERT Worlds Tassilo Klein, Moin Nabi
- Attention-informed Mixed-language Training For Zero-shot Cross-lingual Task-oriented Dialogue Systems Zihan Liu, Genta Indra Winata, Zhaojiang Lin, Peng Xu, Pascale Fung
- Harnessing Evolution Of Multi-turn Conversations For Effective Answer Retrieval Mohammad Aliannejadi, Manajit Chakraborty, Esteban Andrés Ríssola, Fabio Crestani
- Modifying Memories In Transformer Models Chen Zhu et al.
- UNIMO: Towards Unified-modal Understanding And Generation Via Cross-modal Contrastive Learning Wei Li et al.
- Low-rank Bottleneck In Multi-head Attention Models Srinadh Bhojanapalli, Chulhee Yun, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar
- Pre-training Text-to-text Transformers For Concept-centric Common Sense Wangchunshu Zhou et al.
- Controlled Hallucinations: Learning To Generate Faithfully From Noisy Data Katja Filippova
- Modelling Hierarchical Structure Between Dialogue Policy And Natural Language Generator With Option Framework For Task-oriented Dialogue System Jianhong Wang, Yuan Zhang, Tae-kyun Kim, Yunjie Gu
- XGLUE: A New Benchmark Dataset For Cross-lingual Pre-training, Understanding And Generation Yaobo Liang et al.
- Linformer: Self-attention With Linear Complexity Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, Hao Ma
- To Pretrain Or Not To Pretrain: Examining The Benefits Of Pretraining On Resource Rich Tasks Sinong Wang, Madian Khabsa, Hao Ma
- Bert-of-theseus: Compressing BERT By Progressive Module Replacing Canwen Xu, Wangchunshu Zhou, Tao Ge, Furu Wei, Ming Zhou
- Transformers As Soft Reasoners Over Language Peter Clark, Oyvind Tafjord, Kyle Richardson
- Train Large, Then Compress: Rethinking Model Size For Efficient Training And Inference Of Transformers Zhuohan Li et al.
- Pre-trained Summarization Distillation Sam Shleifer, Alexander M. Rush
- Knowledge-driven Data Construction For Zero-shot Evaluation In Commonsense Question Answering Kaixin Ma et al.
- REALM: Retrieval-augmented Language Model Pre-training Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-wei Chang
- KRISP: Integrating Implicit And Symbolic Knowledge For Open-domain Knowledge-based VQA Kenneth Marino, Xinlei Chen, Devi Parikh, Abhinav Gupta, Marcus Rohrbach
- Improved Natural Language Generation Via Loss Truncation Daniel Kang, Tatsunori Hashimoto
- Unifiedqa: Crossing Format Boundaries With A Single QA System Daniel Khashabi et al.
- Fine-tuning Pretrained Language Models: Weight Initializations, Data Orders, And Early Stopping Jesse Dodge et al.
- Pretrained Transformers Improve Out-of-distribution Robustness Dan Hendrycks et al.
- Reducing Gender Bias In Neural Machine Translation As A Domain Adaptation Problem Danielle Saunders, Bill Byrne
- The Chess Transformer: Mastering Play Using Generative Language Models David Noever, Matt Ciolino, Josh Kalin
- Injecting Numerical Reasoning Skills Into Language Models Mor Geva, Ankit Gupta, Jonathan Berant
- Intermediate-task Transfer Learning With Pretrained Models For Natural Language Understanding: When And Why Does It Work? Yada Pruksachatkun et al.
- WT5?! Training Text-to-text Models To Explain Their Predictions Sharan Narang et al.
- KVL-BERT: Knowledge Enhanced Visual-and-linguistic BERT For Visual Commonsense Reasoning Dandan Song, Siyi Ma, Zhanchen Sun, Sicheng Yang, Lejian Liao
- Layoutlmv2: Multi-modal Pre-training For Visually-rich Document Understanding Yang Xu et al.
- How Effective Is Task-agnostic Data Augmentation For Pretrained Transformers? Shayne Longpre, Yu Wang, Christopher Dubois
- Optimus: Organizing Sentences Via Pre-trained Modeling Of A Latent Space Chunyuan Li et al.
- Knowledge-aware Language Model Pretraining Corby Rosset et al.
- Mapping Natural Language Instructions To Mobile UI Action Sequences Yang Li, Jiacong He, Xin Zhou, Yuan Zhang, Jason Baldridge
- Do Response Selection Models Really Know What's Next? Utterance Manipulation Strategies For Multi-turn Response Selection Taesun Whang et al.
- Making Pre-trained Language Models Better Few-shot Learners Tianyu Gao, Adam Fisch, Danqi Chen
- Exploring Fine-tuning Techniques For Pre-trained Cross-lingual Models Via Continual Learning Zihan Liu, Genta Indra Winata, Andrea Madotto, Pascale Fung
- Fine-tuning Pre-trained Language Model With Weak Supervision: A Contrastive-regularized Self-training Approach Yue Yu et al.
- Ternarybert: Distillation-aware Ultra-low Bit BERT Wei Zhang et al.
- TOD-BERT: Pre-trained Natural Language Understanding For Task-oriented Dialogue Chien-sheng Wu, Steven Hoi, Richard Socher, Caiming Xiong
- A Knowledge-enhanced Pretraining Model For Commonsense Story Generation Jian Guan, Fei Huang, Zhihao Zhao, Xiaoyan Zhu, Minlie Huang
- Coda: Contrast-enhanced And Diversity-promoting Data Augmentation For Natural Language Understanding Yanru Qu et al.
- Exploring Versatile Generative Language Model Via Parameter-efficient Transfer Learning Zhaojiang Lin, Andrea Madotto, Pascale Fung
- Gpt-too: A Language-model-first Approach For Amr-to-text Generation Manuel Mager et al.
- How Good Is Your Tokenizer? On The Monolingual Performance Of Multilingual Language Models Phillip Rust, Jonas Pfeiffer, Ivan Vulić, Sebastian Ruder, Iryna Gurevych
- VD-BERT: A Unified Vision And Dialog Transformer With BERT Yue Wang et al.
- Knowledge Distillation For Improved Accuracy In Spoken Question Answering Chenyu You, Nuo Chen, Yuexian Zou
- Better Robustness By More Coverage: Adversarial Training With Mixup Augmentation For Robust Fine-tuning Chenglei Si et al.
- Towards Learning A Generic Agent For Vision-and-language Navigation Via Pre-training Weituo Hao, Chunyuan Li, Xiujun Li, Lawrence Carin, Jianfeng Gao
- Few-shot Generative Conversational Query Rewriting Shi Yu et al.
- Explaining Question Answering Models Through Text Generation Veronica Latcinnik, Jonathan Berant
- Low-resource Knowledge-grounded Dialogue Generation Xueliang Zhao et al.
- Few-shot Text Generation With Pattern-exploiting Training Timo Schick, Hinrich Schütze
- Zero-resource Knowledge-grounded Dialogue Generation Linxiao Li et al.
- BANG: Bridging Autoregressive And Non-autoregressive Generation With Large Scale Pretraining Weizhen Qi et al.
- Byte Pair Encoding Is Suboptimal For Language Model Pretraining Kaj Bostrom, Greg Durrett
- It's Not Just Size That Matters: Small Language Models Are Also Few-shot Learners Timo Schick, Hinrich Schütze
- Sequence-level Mixed Sample Data Augmentation Demi Guo, Yoon Kim, Alexander M. Rush
- Coreferential Reasoning Learning For Language Representation Deming Ye et al.
- Learning To Recombine And Resample Data For Compositional Generalization Ekin Akyürek, Afra Feyza Akyürek, Jacob Andreas
- A Simple But Tough-to-beat Data Augmentation Approach For Natural Language Understanding And Generation Dinghan Shen, Mingzhi Zheng, Yelong Shen, Yanru Qu, Weizhu Chen
- XGPT: Cross-modal Generative Pre-training For Image Captioning Qiaolin Xia et al.
- Alfworld: Aligning Text And Embodied Environments For Interactive Learning Mohit Shridhar et al.
- Inducing Language-agnostic Multilingual Representations Wei Zhao, Steffen Eger, Johannes Bjerva, Isabelle Augenstein
- Mathematical Reasoning Via Self-supervised Skip-tree Training Markus N. Rabe, Dennis Lee, Kshitij Bansal, Christian Szegedy
- On The Stability Of Fine-tuning BERT: Misconceptions, Explanations, And Strong Baselines Marius Mosbach, Maksym Andriushchenko, Dietrich Klakow
- ERNIE-GEN: An Enhanced Multi-flow Pre-training And Fine-tuning Framework For Natural Language Generation Dongling Xiao et al.
- Gshard: Scaling Giant Models With Conditional Computation And Automatic Sharding Dmitry Lepikhin et al.
- Logic2text: High-fidelity Natural Language Generation From Logical Forms Zhiyu Chen et al.
- Codebert: A Pre-trained Model For Programming And Natural Languages Zhangyin Feng et al.
- M3P: Learning Universal Representations Via Multitask Multilingual Multimodal Pre-training Minheng Ni et al.
- Scaling Laws For Neural Language Models Jared Kaplan et al.
- Deformer: Decomposing Pre-trained Transformers For Faster Question Answering Qingqing Cao, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian
- PLATO-2: Towards Building An Open-domain Chatbot Via Curriculum Learning Siqi Bao et al.
- TIME: Text And Image Mutual-translation Adversarial Networks Bingchen Liu, Kunpeng Song, Yizhe Zhu, Gerard De Melo, Ahmed Elgammal
- PALM: Pre-training An Autoencoding & Autoregressive Language Model For Context-conditioned Generation Bin Bi et al.
- Query Resolution For Conversational Search With Limited Supervision Nikos Voskarides, Dan Li, Pengjie Ren, Evangelos Kanoulas, Maarten De Rijke
- When Being Unseen From Mbert Is Just The Beginning: Handling New Languages With Multilingual Language Models Benjamin Muller, Antonis Anastasopoulos, Benoît Sagot, Djamé Seddah
- Training Large Neural Networks With Constant Memory Using A New Execution Algorithm Bharadwaj Pudipeddi, Maral Mesmakhosroshahi, Jinwen Xi, Sujeeth Bharadwaj
- Tabert: Pretraining For Joint Understanding Of Textual And Tabular Data Pengcheng Yin, Graham Neubig, Wen-tau Yih, Sebastian Riedel
- Gedi: Generative Discriminator Guided Sequence Generation Ben Krause et al.
- Unqovering Stereotyping Biases Via Underspecified Questions Tao Li, Tushar Khot, Daniel Khashabi, Ashish Sabharwal, Vivek Srikumar
- SOLOIST: Building Task Bots At Scale With Transfer Learning And Machine Teaching Baolin Peng et al.
- Contrastive Distillation On Intermediate Representations For Language Model Compression Siqi Sun et al.
- Exploring And Predicting Transferability Across NLP Tasks Tu Vu et al.
- When Do You Need Billions Of Words Of Pretraining Data? Yian Zhang, Alex Warstadt, Haau-sing Li, Samuel R. Bowman
- Recipes For Safety In Open-domain Chatbots Jing Xu et al.
- Beyond I.I.D.: Three Levels Of Generalization For Question Answering On Knowledge Bases Yu Gu et al.
- Chatbot Interaction With Artificial Intelligence: Human Data Augmentation With T5 And Language Transformer Ensemble For Text Classification Jordan J. Bird, Anikó Ekárt, Diego R. Faria
- Recipes For Building An Open-domain Chatbot Stephen Roller et al.
- Improving Vision-and-language Navigation With Image-text Pairs From The Web Arjun Majumdar et al.
- SPECTER: Document-level Representation Learning Using Citation-informed Transformers Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld
- DIET: Lightweight Language Understanding For Dialogue Systems Tanja Bunk, Daksh Varshneya, Vladimir Vlasov, Alan Nichol
- From Zero To Hero: On The Limitations Of Zero-shot Cross-lingual Transfer With Multilingual Transformers Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš
- A Large-scale Chinese Short-text Conversation Dataset Yida Wang et al.
- Just Ask: Learning To Answer Questions From Millions Of Narrated Videos Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- A Recurrent Vision-and-language BERT For Navigation Yicong Hong, Qi Wu, Yuankai Qi, Cristian Rodriguez-opazo, Stephen Gould
- Better Fine-tuning By Reducing Representational Collapse Armen Aghajanyan et al.
- Generative Data Augmentation For Commonsense Reasoning Yiben Yang et al.
- ABNIRML: Analyzing The Behavior Of Neural IR Models Sean Macavaney, Sergey Feldman, Nazli Goharian, Doug Downey, Arman Cohan
- ECONET: Effective Continual Pretraining Of Language Models For Event Temporal Reasoning Rujun Han, Xiang Ren, Nanyun Peng
- Incorporating BERT Into Parallel Sequence Decoding With Adapters Junliang Guo et al.
- Adapterdrop: On The Efficiency Of Adapters In Transformers Andreas Rücklé et al.
- Language Models As Few-shot Learner For Task-oriented Dialogue Systems Andrea Madotto, Zihan Liu, Zhaojiang Lin, Pascale Fung
- CPM: A Large-scale Generative Chinese Pre-trained Language Model Zhengyan Zhang et al.
- Autoprompt: Eliciting Knowledge From Language Models With Automatically Generated Prompts Taylor Shin, Yasaman Razeghi, Robert L. Logan IV, Eric Wallace, Sameer Singh
- Syntactic Data Augmentation Increases Robustness To Inference Heuristics Junghyun Min, R. Thomas Mccoy, Dipanjan Das, Emily Pitler, Tal Linzen
- End-to-end Synthetic Data Generation For Domain Adaptation Of Question Answering Systems Siamak Shakeri et al.
- Multilingual Denoising Pre-training For Neural Machine Translation Yinhan Liu et al.
- What Happens To BERT Embeddings During Fine-tuning? Amil Merchant, Elahe Rahimtoroghi, Ellie Pavlick, Ian Tenney
- Ernie-doc: A Retrospective Long-document Modeling Transformer Siyu Ding et al.
- Are We Pretraining It Right? Digging Deeper Into Visio-linguistic Pretraining Amanpreet Singh, Vedanuj Goswami, Devi Parikh
- Behind The Scene: Revealing The Secrets Of Pre-trained Vision-and-language Models Jize Cao et al.
- Relevance-guided Supervision For Openqa With Colbert Omar Khattab, Christopher Potts, Matei Zaharia
- Contrastive Learning With Adversarial Perturbations For Conditional Text Generation Seanie Lee, Dong Bok Lee, Sung Ju Hwang
- Proofwriter: Generating Implications, Proofs, And Abductive Statements Over Natural Language Oyvind Tafjord, Bhavana Dalvi Mishra, Peter Clark
- Cocon: A Self-supervised Approach For Controlled Text Generation Alvin Chan, Yew-soon Ong, Bill Pung, Aston Zhang, Jie Fu
- Leap-of-thought: Teaching Pre-trained Models To Systematically Reason Over Implicit Knowledge Alon Talmor, Oyvind Tafjord, Peter Clark, Yoav Goldberg, Jonathan Berant
- BLEURT: Learning Robust Metrics For Text Generation Thibault Sellam, Dipanjan Das, Ankur P. Parikh
- Recall And Learn: Fine-tuning Deep Pretrained Language Models With Less Forgetting Sanyuan Chen et al.
- Coregen: Contextualized Code Representation Learning For Commit Message Generation Lun Yiu Nie et al.
- GOBO: Quantizing Attention-based NLP Models For Low Latency And Energy Efficient Inference Ali Hadi Zadeh, Isak Edo, Omar Mohamed Awad, Andreas Moshovos
- Auto-captions On GIF: A Large-scale Video-sentence Dataset For Vision-language Pre-training Yingwei Pan et al.
- Fine-tuning BERT For Schema-guided Zero-shot Dialogue State Tracking Yu-ping Ruan, Zhen-hua Ling, Jia-chen Gu, Quan Liu
- GREEK-BERT: The Greeks Visiting Sesame Street John Koutsikakis, Ilias Chalkidis, Prodromos Malakasiotis, Ion Androutsopoulos
- Template-based Question Generation From Retrieved Sentences For Improved Unsupervised Question Answering Alexander R. Fabbri, Patrick Ng, Zhiguo Wang, Ramesh Nallapati, Bing Xiang
- Contrastive Code Representation Learning Paras Jain et al.
- An Empirical Investigation Of Pre-trained Transformer Language Models For Open-domain Dialogue Generation Piji Li
- Facts As Experts: Adaptable And Interpretable Neural Memory Over Symbolic Knowledge Pat Verga, Haitian Sun, Livio Baldini Soares, William W. Cohen
- Adapterhub: A Framework For Adapting Transformers Jonas Pfeiffer et al.
- Measuring And Reducing Gendered Correlations In Pre-trained Models Kellie Webster et al.
- POINTER: Constrained Progressive Text Generation Via Insertion-based Generative Pre-training Yizhe Zhang et al.
- Realtoxicityprompts: Evaluating Neural Toxic Degeneration In Language Models Samuel Gehman, Suchin Gururangan, Maarten Sap, Yejin Choi, Noah A. Smith
- Prophetnet: Predicting Future N-gram For Sequence-to-sequence Pre-training Weizhen Qi et al.
- Logic-guided Data Augmentation And Regularization For Consistent Question Answering Akari Asai, Hannaneh Hajishirzi
- Encoding Syntactic Knowledge In Transformer Encoder For Intent Detection And Slot Filling Jixuan Wang, Kai Wei, Martin Radfar, Weiwei Zhang, Clement Chung
- Question And Answer Test-train Overlap In Open-domain Question Answering Datasets Patrick Lewis, Pontus Stenetorp, Sebastian Riedel
- Mixkd: Towards Efficient Distillation Of Large-scale Language Models Kevin J Liang et al.
- Retrieval-augmented Generation For Knowledge-intensive NLP Tasks Patrick Lewis et al.
- ETC: Encoding Long And Structured Inputs In Transformers Joshua Ainslie et al.
- Countering Language Drift With Seeded Iterated Learning Yuchen Lu, Soumye Singhal, Florian Strub, Olivier Pietquin, Aaron Courville
- How Much Knowledge Can You Pack Into The Parameters Of A Language Model? Adam Roberts, Colin Raffel, Noam Shazeer
- Schema-guided Dialogue State Tracking Task At DSTC8 Abhinav Rastogi, Xiaoxue Zang, Srinivas Sunkara, Raghav Gupta, Pranav Khaitan
- Dialoglue: A Natural Language Understanding Benchmark For Task-oriented Dialogue Shikib Mehri, Mihail Eric, Dilek Hakkani-tur
- UBAR: Towards Fully End-to-end Task-oriented Dialog Systems With GPT-2 Yunyi Yang, Yunhao Li, Xiaojun Quan
- Nearest Neighbor Machine Translation Urvashi Khandelwal, Angela Fan, Dan Jurafsky, Luke Zettlemoyer, Mike Lewis
- Language Models Are Few-shot Learners Tom B. Brown et al.
- Text-to-text Pre-training For Data-to-text Tasks Mihir Kale, Abhinav Rastogi
- How Can We Know When Language Models Know? On The Calibration Of Language Models For Question Answering Zhengbao Jiang, Jun Araki, Haibo Ding, Graham Neubig
- Incorporating External Knowledge Through Pre-training For Natural Language To Code Generation Frank F. Xu, Zhengbao Jiang, Pengcheng Yin, Bogdan Vasilescu, Graham Neubig
- Mintl: Minimalist Transfer Learning For Task-oriented Dialogue Systems Zhaojiang Lin, Andrea Madotto, Genta Indra Winata, Pascale Fung
- Earlybert: Efficient BERT Training Via Early-bird Lottery Tickets Xiaohan Chen et al.
- Text Generation By Learning From Demonstrations Richard Yuanzhe Pang, He He
- Cosda-ml: Multi-lingual Code-switching Data Augmentation For Zero-shot Cross-lingual NLP Libo Qin, Minheng Ni, Yue Zhang, Wanxiang Che
- Human Instruction-following With Deep Reinforcement Learning Via Transfer-learning From Text Felix Hill, Sona Mokra, Nathaniel Wong, Tim Harley
- Multi-modal Open-domain Dialogue Kurt Shuster, Eric Michael Smith, Da Ju, Jason Weston
- Pre-training Via Paraphrasing Mike Lewis et al.
- How Fine Can Fine-tuning Be? Learning Efficient Language Models Evani Radiya-dixit, Xin Wang
- Grounded Language Learning Fast And Slow Felix Hill et al.
- Variational Transformers For Diverse Response Generation Zhaojiang Lin, Genta Indra Winata, Peng Xu, Zihan Liu, Pascale Fung
- Robust Encodings: A Framework For Combating Adversarial Typos Erik Jones, Robin Jia, Aditi Raghunathan, Percy Liang
- Accelerating Training Of Transformer-based Language Models With Progressive Layer Dropping Minjia Zhang, Yuxiong He
- CERT: Contrastive Self-supervised Learning For Language Understanding Hongchao Fang, Sicheng Wang, Meng Zhou, Jiayuan Ding, Pengtao Xie
- Multilingual Translation With Extensible Multilingual Pretraining And Finetuning Yuqing Tang et al.
- XLM-T: Scaling Up Multilingual Machine Translation With Pretrained Cross-lingual Transformer Encoders Shuming Ma et al.
- Length-adaptive Transformer: Train Once With Length Drop, Use Anytime With Search Gyuwan Kim, Kyunghyun Cho
- What Does BERT Know About Books, Movies And Music? Probing BERT For Conversational Recommendation Gustavo Penha, Claudia Hauff
- Rethinking Positional Encoding In Language Pre-training Guolin Ke, Di He, Tie-yan Liu
- KGPT: Knowledge-grounded Pre-training For Data-to-text Generation Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang
- Residual Energy-based Models For Text Generation Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'aurelio Ranzato
- Contrastive Triple Extraction With Generative Transformer Hongbin Ye et al.
- Data Manipulation: Towards Effective Instance Learning For Neural Dialogue Generation Via Learning To Augment And Reweight Hengyi Cai et al.
- Training Question Answering Models From Synthetic Data Raul Puri, Ryan Spring, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro
- Very Deep Transformers For Neural Machine Translation Xiaodong Liu, Kevin Duh, Liyuan Liu, Jianfeng Gao
- Mixup-transformer: Dynamic Data Augmentation For NLP Tasks Lichao Sun et al.
- Mobilebert: A Compact Task-agnostic BERT For Resource-limited Devices Zhiqing Sun et al.
- Dialogbert: Discourse-aware Response Generation Via Learning To Recover And Rank Utterances Xiaodong Gu, Kang Min Yoo, Jung-woo Ha
- Document Ranking With A Pretrained Sequence-to-sequence Model Rodrigo Nogueira, Zhiying Jiang, Jimmy Lin
- DAVE: Deriving Automatically Verilog From English Hammond Pearce, Benjamin Tan, Ramesh Karri
- Indic-transformers: An Analysis Of Transformer Language Models For Indian Languages Kushal Jain, Adwait Deshpande, Kumar Shridhar, Felix Laumann, Ayushman Dash
- Unilmv2: Pseudo-masked Language Models For Unified Language Model Pre-training Hangbo Bao et al.
- Multilingual Speech Translation With Efficient Finetuning Of Pretrained Models Xian Li et al.
- Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation Ruibo Liu et al.
- BERT Based Multilingual Machine Comprehension In English And Hindi Somil Gupta, Nilesh Khade
- Language Generation With Multi-hop Reasoning On Commonsense Knowledge Graph Haozhe Ji et al.
- Minilm: Deep Self-attention Distillation For Task-agnostic Compression Of Pre-trained Transformers Wenhui Wang et al.
- Can You Put It All Together: Evaluating Conversational Agents' Ability To Blend Skills Eric Michael Smith, Mary Williamson, Kurt Shuster, Jason Weston, Y-lan Boureau
- Controlling Style In Generated Dialogue Eric Michael Smith, Diana Gonzalez-rico, Emily Dinan, Y-lan Boureau
- X-LXMERT: Paint, Caption And Answer Questions With Multi-modal Transformers Jaemin Cho, Jiasen Lu, Dustin Schwenk, Hannaneh Hajishirzi, Aniruddha Kembhavi
- TAP: Text-aware Pre-training For Text-vqa And Text-caption Zhengyuan Yang et al.
- ELECTRA: Pre-training Text Encoders As Discriminators Rather Than Generators Kevin Clark, Minh-thang Luong, Quoc V. Le, Christopher D. Manning
- On Optimal Transformer Depth For Low-resource Language Translation Elan Van Biljon, Arnu Pretorius, Julia Kreutzer
- Adversarial Training For Large Neural Language Models Xiaodong Liu et al.
- Unsupervised Evaluation Of Interactive Dialog With Dialogpt Shikib Mehri, Maxine Eskenazi
- Logical Natural Language Generation From Open-domain Tables Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang
- Mt5: A Massively Multilingual Pre-trained Text-to-text Transformer Linting Xue et al.
- The Pile: An 800GB Dataset Of Diverse Text For Language Modeling Leo Gao et al.
- Will I Sound Like Me? Improving Persona Consistency In Dialogues Through Pragmatic Self-consciousness Hyunwoo Kim, Byeongchang Kim, Gunhee Kim
- Rethinking The Value Of Transformer Components Wenxuan Wang, Zhaopeng Tu
- Investigating Pretrained Language Models For Graph-to-text Generation Leonardo F. R. Ribeiro, Martin Schmitt, Hinrich Schütze, Iryna Gurevych
- As Good As New. How To Successfully Recycle English GPT-2 To Make Models For Other Languages Wietse De Vries, Malvina Nissim
- Funnel-transformer: Filtering Out Sequential Redundancy For Efficient Language Processing Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le
- Charbert: Character-aware Pre-trained Language Model Wentao Ma et al.
- Rethinking Embedding Coupling In Pre-trained Language Models Hyung Won Chung, Thibault Févry, Henry Tsai, Melvin Johnson, Sebastian Ruder
- Retrofitting Structure-aware Transformer Language Model For End Tasks Hao Fei, Yafeng Ren, Donghong Ji
- LRC-BERT: Latent-representation Contrastive Knowledge Distillation For Natural Language Understanding Hao Fu et al.
- Vokenization: Improving Language Understanding With Contextualized, Visual-grounded Supervision Hao Tan, Mohit Bansal
- An Exploratory Study On Long Dialogue Summarization: What Works And What's Next Yusen Zhang et al.
- Improving Coherence And Consistency In Neural Sequence Models With Dual-system, Neuro-symbolic Reasoning Maxwell Nye, Michael Henry Tessler, Joshua B. Tenenbaum, Brenden M. Lake
- Increasing Faithfulness In Knowledge-grounded Dialogue With Controllable Features Hannah Rashkin, David Reitter, Gaurav Singh Tomar, Dipanjan Das
- Few-shot Learning With Multilingual Language Models Xi Victoria Lin et al.
- Lightner: A Lightweight Tuning Paradigm For Low-resource NER Via Pluggable Prompting Xiang Chen et al.
- Bias Out-of-the-box: An Empirical Analysis Of Intersectional Occupational Biases In Popular Generative Language Models Hannah Kirk et al.
- On Transferability Of Prompt Tuning For Natural Language Processing Yusheng Su et al.
- Vlmo: Unified Vision-language Pre-training With Mixture-of-modality-experts Hangbo Bao et al.
- Truthfulqa: Measuring How Models Mimic Human Falsehoods Stephanie Lin, Jacob Hilton, Owain Evans
- Ernie-vilg: Unified Generative Pre-training For Bidirectional Vision-language Generation Han Zhang et al.
- E2E-VLP: End-to-end Vision-language Pre-training Enhanced By Visual Learning Haiyang Xu et al.
- Codified Audio Language Modeling Learns Useful Representations For Music Information Retrieval Rodrigo Castellon, Chris Donahue, Percy Liang
- Wenlan: Bridging Vision And Language By Large-scale Multi-modal Pre-training Yuqi Huo et al.
- Prefix-tuning: Optimizing Continuous Prompts For Generation Xiang Lisa Li, Percy Liang
- P-tuning V2: Prompt Tuning Can Be Comparable To Fine-tuning Universally Across Scales And Tasks Xiao Liu et al.
- GPT Understands, Too Xiao Liu et al.
- Learning How To Ask: Querying Lms With Mixtures Of Soft Prompts Guanghui Qin, Jason Eisner
- Contrastive Learning For Many-to-many Multilingual Neural Machine Translation Xiao Pan, Mingxuan Wang, Liwei Wu, Lei Li
- G-transformer For Document-level Machine Translation Guangsheng Bao, Yue Zhang, Zhiyang Teng, Boxing Chen, Weihua Luo
- Unifying Multimodal Transformer For Bi-directional Image And Text Generation Yupan Huang, Hongwei Xue, Bei Liu, Yutong Lu
- Improved Text Classification Via Contrastive Adversarial Training Lin Pan, Chung-wei Hang, Avirup Sil, Saloni Potdar
- Cutting Down On Prompts And Parameters: Simple Few-shot Learning With Language Models Robert L. Logan IV et al.
- Mention Memory: Incorporating Textual Knowledge Into Transformers Through Entity Mention Attention Michiel De Jong, Yury Zemlyanskiy, Nicholas Fitzgerald, Fei Sha, William Cohen
- Robeczech: Czech Roberta, A Monolingual Contextualized Language Representation Model Milan Straka, Jakub Náplava, Jana Straková, David Samuel
- Crossing The Conversational Chasm: A Primer On Natural Language Processing For Multilingual Task-oriented Dialogue Systems Evgeniia Razumovskaia et al.
- True Few-shot Learning With Language Models Ethan Perez, Douwe Kiela, Kyunghyun Cho
- Text-free Prosody-aware Generative Spoken Language Modeling Eugene Kharitonov et al.
- Multimodal Dialogue Response Generation Qingfeng Sun et al.
- Advancing High-resolution Video-language Representation With Large-scale Video Transcriptions Hongwei Xue et al.
- Wangchanberta: Pretraining Transformer-based Thai Language Models Lalita Lowphansirikul, Charin Polpanumas, Nawat Jantrakulchai, Sarana Nutanong
- Explaining Documents' Relevance To Search Queries Razieh Rahimi, Youngwoo Kim, Hamed Zamani, James Allan
- MWP-BERT: Numeracy-augmented Pre-training For Math Word Problem Solving Zhenwen Liang et al.
- Bob: BERT Over BERT For Training Persona-based Dialogue Models From Limited Personalized Data Haoyu Song, Yan Wang, Kaiyan Zhang, Wei-nan Zhang, Ting Liu
- Learning Rich Representation Of Keyphrases From Text Mayank Kulkarni, Debanjan Mahata, Ravneet Arora, Rajarshi Bhowmik
- Retrieval Augmentation Reduces Hallucination In Conversation Kurt Shuster, Spencer Poff, Moya Chen, Douwe Kiela, Jason Weston
- EVA: An Open-domain Chinese Dialogue System With Large-scale Generative Pre-training Hao Zhou et al.
- FILIP: Fine-grained Interactive Language-image Pre-training Lewei Yao et al.
- Structural Adapters In Pretrained Language Models For Amr-to-text Generation Leonardo F. R. Ribeiro, Yue Zhang, Iryna Gurevych
- Fine-tuning Large Neural Language Models For Biomedical Natural Language Processing Robert Tinn et al.
- An Explanation Of In-context Learning As Implicit Bayesian Inference Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma
- Multitask Prompted Training Enables Zero-shot Task Generalization Victor Sanh et al.
- Thank You BART! Rewarding Pre-trained Models Improves Formality Style Transfer Huiyuan Lai, Antonio Toral, Malvina Nissim
- Unsupervised Corpus Aware Language Model Pre-training For Dense Passage Retrieval Luyu Gao, Jamie Callan
- ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Shuohuan Wang et al.
- Fewshotqa: A Simple Framework For Few-shot Learning Of Question Answering Tasks Using Pre-trained Text-to-text Models Rakesh Chada, Pradeep Natarajan
- Mitigating Political Bias In Language Models Through Reinforced Calibration Ruibo Liu et al.
- Process For Adapting Language Models To Society (PALMS) With Values-targeted Datasets Irene Solaiman, Christy Dennison
- Training Large-scale News Recommenders With Pretrained Language Models In The Loop Shitao Xiao, Zheng Liu, Yingxia Shao, Tao Di, Xing Xie
- Bitod: A Bilingual Multi-domain Dataset For Task-oriented Dialogue Modeling Zhaojiang Lin et al.
- Vision-and-language Or Vision-for-language? On Cross-modal Influence In Multimodal Transformers Stella Frank, Emanuele Bugliarello, Desmond Elliott
- Revisiting The Primacy Of English In Zero-shot Cross-lingual Transfer Iulia Turc, Kenton Lee, Jacob Eisenstein, Ming-wei Chang, Kristina Toutanova
- Pretrained Transformers As Universal Computation Engines Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch
- Fast Model Editing At Scale Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning
- Using Prior Knowledge To Guide Bert's Attention In Semantic Textual Matching Tasks Tingyu Xia, Yue Wang, Yuan Tian, Yi Chang
- GPT-3 Models Are Poor Few-shot Learners In The Biomedical Domain Milad Moradi, Kathrin Blagec, Florian Haberl, Matthias Samwald
- Unipelt: A Unified Framework For Parameter-efficient Language Model Tuning Yuning Mao et al.
- A Recipe For Arbitrary Text Style Transfer With Large Language Models Emily Reif et al.
- Societal Biases In Language Generation: Progress And Challenges Emily Sheng, Kai-wei Chang, Premkumar Natarajan, Nanyun Peng
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- All That's 'human' Is Not Gold: Evaluating Human Evaluation Of Generated Text Elizabeth Clark et al.
- How Should Pre-trained Language Models Be Fine-tuned Towards Adversarial Robustness? Xinhsuai Dong, Luu Anh Tuan, Min Lin, Shuicheng Yan, Hanwang Zhang
- Bitfit: Simple Parameter-efficient Fine-tuning For Transformer-based Masked Language-models Elad Ben Zaken, Shauli Ravfogel, Yoav Goldberg
- Efficient Large Scale Language Modeling With Mixtures Of Experts Mikel Artetxe et al.
- Investigating The Limitations Of Transformers With Simple Arithmetic Tasks Rodrigo Nogueira, Zhiying Jiang, Jimmy Lin
- Lora: Low-rank Adaptation Of Large Language Models Edward J. Hu et al.
- Improving Gender Fairness Of Pre-trained Language Models Without Catastrophic Forgetting Zahra Fatemi, Chen Xing, Wenhao Liu, Caiming Xiong
- Align And Prompt: Video-and-language Pre-training With Entity Prompts Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C. H. Hoi
- Sequence Length Is A Domain: Length-based Overfitting In Transformer Models Dušan Variš, Ondřej Bojar
- XLM-E: Cross-lingual Language Model Pre-training Via ELECTRA Zewen Chi et al.
- Clip4caption: CLIP For Video Caption Mingkang Tang et al.
- Generate, Annotate, And Learn: NLP With Synthetic Text Xuanli He, Islam Nassar, Jamie Kiros, Gholamreza Haffari, Mohammad Norouzi
- Urltran: Improving Phishing URL Detection Using Transformers Pranav Maneriker et al.
- End-to-end Training Of Multi-document Reader And Retriever For Open-domain Question Answering Devendra Singh Sachan, Siva Reddy, William Hamilton, Chris Dyer, Dani Yogatama
- Causal Attention For Vision-language Tasks Xu Yang, Hanwang Zhang, Guojun Qi, Jianfei Cai
- Self-guided Contrastive Learning For BERT Sentence Representations Taeuk Kim, Kang Min Yoo, Sang-goo Lee
- Dialoglm: Pre-trained Model For Long Dialogue Understanding And Summarization Ming Zhong, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng
- Improving And Simplifying Pattern Exploiting Training Derek Tam, Rakesh R Menon, Mohit Bansal, Shashank Srivastava, Colin Raffel
- Internet-augmented Dialogue Generation Mojtaba Komeili, Kurt Shuster, Jason Weston
- Efficient Large-scale Language Model Training On GPU Clusters Using Megatron-lm Deepak Narayanan et al.
- Compressing Visual-linguistic Model Via Knowledge Distillation Zhiyuan Fang et al.
- Hierarchical Learning For Generation With Long Source Sequences Tobias Rohde, Xiaoxia Wu, Yinhan Liu
- UC2: Universal Cross-lingual Cross-modal Vision-and-language Pre-training Mingyang Zhou et al.
- Primer: Searching For Efficient Transformers For Language Modeling David R. So et al.
- Greedy-layer Pruning: Speeding Up Transformer Models For Natural Language Processing David Peer, Sebastian Stabinger, Stefan Engl, Antonio Rodriguez-sanchez
- Cross-attention Is All You Need: Adapting Pretrained Transformers For Machine Translation Mozhdeh Gheini, Xiang Ren, Jonathan May
- PPT: Pre-trained Prompt Tuning For Few-shot Learning Yuxian Gu, Xu Han, Zhiyuan Liu, Minlie Huang
- Luna: Linear Unified Nested Attention Xuezhe Ma et al.
- Ext5: Towards Extreme Multi-task Scaling For Transfer Learning Vamsi Aribandi et al.
- Calibrate Before Use: Improving Few-shot Performance Of Language Models Tony Z. Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh
- A Plug-and-play Method For Controlled Text Generation Damian Pascual, Beni Egressy, Clara Meister, Ryan Cotterell, Roger Wattenhofer
- Zero-shot Recommendation As Language Modeling Damien Sileo, Wout Vossen, Robbe Raymaekers
- Knowledge Neurons In Pretrained Transformers Damai Dai et al.
- Differentially Private Fine-tuning Of Language Models Da Yu et al.
- Long-span Summarization Via Local Attention And Content Selection Potsawee Manakul, Mark J. F. Gales
- The Stability-efficiency Dilemma: Investigating Sequence Length Warmup For Training GPT Models Conglong Li, Minjia Zhang, Yuxiong He
- Why Do Pretrained Language Models Help In Downstream Tasks? An Analysis Of Head And Prompt Tuning Colin Wei, Sang Michael Xie, Tengyu Ma
- MAGMA -- Multimodal Augmentation Of Generative Models Through Adapter-based Finetuning Constantin Eichenberg, Sidney Black, Samuel Weinbach, Letitia Parcalabescu, Anette Frank
- What To Pre-train On? Efficient Intermediate Task Selection Clifton Poth, Jonas Pfeiffer, Andreas Rücklé, Iryna Gurevych
- Codet5: Identifier-aware Unified Pre-trained Encoder-decoder Models For Code Understanding And Generation Yue Wang, Weishi Wang, Shafiq Joty, Steven C. H. Hoi
- Meta-learning Via Language Model In-context Tuning Yanda Chen, Ruiqi Zhong, Sheng Zha, George Karypis, He He
- Can Generative Pre-trained Language Models Serve As Knowledge Bases For Closed-book QA? Cunxiang Wang, Pai Liu, Yue Zhang
- LAION-400M: Open Dataset Of Clip-filtered 400 Million Image-text Pairs Christoph Schuhmann et al.
- Supervision Exists Everywhere: A Data Efficient Contrastive Language-image Pre-training Paradigm Yangguang Li et al.
- Counterfactual Memorization In Neural Language Models Chiyuan Zhang et al.
- Glam: Efficient Scaling Of Language Models With Mixture-of-experts Nan Du et al.
- Structurallm: Structural Pre-training For Form Understanding Chenliang Li et al.
- Larger-scale Transformers For Multilingual Masked Language Modeling Naman Goyal, Jingfei Du, Myle Ott, Giri Anantharaman, Alexis Conneau
- Cross-task Generalization Via Natural Language Crowdsourcing Instructions Swaroop Mishra, Daniel Khashabi, Chitta Baral, Hannaneh Hajishirzi
- Prompting Visual-language Models For Efficient Video Understanding Chen Ju, Tengda Han, Kunhao Zheng, Ya Zhang, Weidi Xie
- Terapipe: Token-level Pipeline Parallelism For Training Large-scale Language Models Zhuohan Li et al.
- NÜWA: Visual Synthesis Pre-training For Neural Visual World Creation Chenfei Wu et al.
- Generating Datasets With Pretrained Language Models Timo Schick, Hinrich Schütze
- Simvlm: Simple Visual Language Model Pretraining With Weak Supervision Zirui Wang et al.
- Fantastically Ordered Prompts And Where To Find Them: Overcoming Few-shot Prompt Order Sensitivity Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, Pontus Stenetorp
- Is GPT-3 Text Indistinguishable From Human Text? Scarecrow: A Framework For Scrutinizing Machine Text Yao Dou, Maxwell Forbes, Rik Koncel-kedziorski, Noah A. Smith, Yejin Choi
- Climatebert: A Pretrained Language Model For Climate-related Text Nicolas Webersinke, Mathias Kraus, Julia Anna Bingler, Markus Leippold
- Symbolic Knowledge Distillation: From General Language Models To Commonsense Models Peter West et al.
- Recent Advances In Natural Language Processing Via Large Pre-trained Language Models: A Survey Bonan Min et al.
- Adversarial GLUE: A Multi-task Benchmark For Robustness Evaluation Of Language Models Boxin Wang et al.
- What Changes Can Large-scale Language Models Bring? Intensive Study On Hyperclova: Billions-scale Korean Generative Pretrained Transformers Boseop Kim et al.
- MMBERT: Multimodal BERT Pretraining For Improved Medical VQA Yash Khare et al.
- Understanding And Overcoming The Challenges Of Efficient Transformer Quantization Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort
- Scheduled Sampling In Vision-language Pretraining With Decoupled Encoder-decoder Network Yehao Li, Yingwei Pan, Ting Yao, Jingwen Chen, Tao Mei
- Prompt-learning For Fine-grained Entity Typing Ning Ding et al.
- NSP-BERT: A Prompt-based Few-shot Learner Through An Original Pre-training Task--next Sentence Prediction Yi Sun, Yu Zheng, Chao Hao, Hangping Qiu
- Neural Path Hunter: Reducing Hallucination In Dialogue Systems Via Path Grounding Nouha Dziri, Andrea Madotto, Osmar Zaiane, Avishek Joey Bose
- SGEITL: Scene Graph Enhanced Image-text Learning For Visual Commonsense Reasoning Zhecan Wang et al.
- Scale Efficiently: Insights From Pre-training And Fine-tuning Transformers Yi Tay et al.
- Medically Aware GPT-3 As A Data Generator For Medical Dialogue Summarization Bharath Chintagunta, Namit Katariya, Xavier Amatriain, Anitha Kannan
- Are Pre-trained Convolutions Better Than Pre-trained Transformers? Yi Tay et al.
- Towards Few-shot Fact-checking Via Perplexity Nayeon Lee, Yejin Bang, Andrea Madotto, Madian Khabsa, Pascale Fung
- Maria: Spanish Language Models Asier Gutiérrez-fandiño et al.
- CDLM: Cross-document Language Modeling Avi Caciularu et al.
- Hindsight: Posterior-guided Training Of Retrievers For Improved Open-ended Generation Ashwin Paranjape, Omar Khattab, Christopher Potts, Matei Zaharia, Christopher D. Manning
- Debertav3: Improving Deberta Using Electra-style Pre-training With Gradient-disentangled Embedding Sharing Pengcheng He, Jianfeng Gao, Weizhu Chen
- Towards Facilitating Empathic Conversations In Online Mental Health Support: A Reinforcement Learning Approach Ashish Sharma, Inna W. Lin, Adam S. Miner, David C. Atkins, Tim Althoff
- HTLM: Hyper-text Pre-training And Prompting Of Language Models Armen Aghajanyan et al.
- Muppet: Massive Multi-task Representations With Pre-finetuning Armen Aghajanyan et al.
- Vl-adapter: Parameter-efficient Transfer Learning For Vision-and-language Tasks Yi-lin Sung, Jaemin Cho, Mohit Bansal
- Prune Once For All: Sparse Pre-trained Language Models Ofir Zafrir, Ariel Larey, Guy Boudoukh, Haihao Shen, Moshe Wasserblat
- Sustainable Modular Debiasing Of Language Models Anne Lauscher, Tobias Lüken, Goran Glavaš
- ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Yu Sun et al.
- An Empirical Study Of Training End-to-end Vision-and-language Transformers Zi-yi Dou et al.
- Wordcraft: A Human-ai Collaborative Editor For Story Writing Andy Coenen, Luke Davis, Daphne Ippolito, Emily Reif, Ann Yuan
- Mind The Gap: Assessing Temporal Generalization In Neural Language Models Angeliki Lazaridou et al.
- Learning To Retrieve Prompts For In-context Learning Ohad Rubin, Jonathan Herzig, Jonathan Berant
- GLM: General Language Model Pretraining With Autoregressive Blank Infilling Zhengxiao Du et al.
- When Attention Meets Fast Recurrence: Training Language Models With Reduced Compute Tao Lei
- Demix Layers: Disentangling Domains For Modular Language Modeling Suchin Gururangan, Mike Lewis, Ari Holtzman, Noah A. Smith, Luke Zettlemoyer
- Few-shot Bot: Prompt-based Learning For Dialogue Systems Andrea Madotto, Zhaojiang Lin, Genta Indra Winata, Pascale Fung
- Denseclip: Language-guided Dense Prediction With Context-aware Prompting Yongming Rao et al.
- Revisiting Self-training For Few-shot Learning Of Language Model Yiming Chen et al.
- COCO-LM: Correcting And Contrasting Text Sequences For Language Model Pretraining Yu Meng et al.
- A General Language Assistant As A Laboratory For Alignment Amanda Askell et al.
- FLAVA: A Foundational Language And Vision Alignment Model Amanpreet Singh et al.
- Few-shot Question Answering By Pretraining Span Selection Ori Ram, Yuval Kirstain, Jonathan Berant, Amir Globerson, Omer Levy
- General-purpose Question-answering With Macaw Oyvind Tafjord, Peter Clark
- Baleen: Robust Multi-hop Reasoning At Scale Via Condensed Retrieval Omar Khattab, Christopher Potts, Matei Zaharia
- Task-oriented Dialogue System As Natural Language Generation Weizhi Wang et al.
- CPM-2: Large-scale Cost-effective Pre-trained Language Models Zhengyan Zhang et al.
- Episodic Transformer For Vision-and-language Navigation Alexander Pashevich, Cordelia Schmid, Chen Sun
- Tacl: Improving BERT Pre-training With Token-aware Contrastive Learning Yixuan Su et al.
- KM-BART: Knowledge Enhanced Multimodal BART For Visual Commonsense Generation Yiran Xing et al.
- Multi-task Pre-training For Plug-and-play Task-oriented Dialogue System Yixuan Su et al.
- On Explaining Your Explanations Of BERT: An Empirical Study With Sequence Classification Zhengxuan Wu, Desmond C. Ong
- Large Pre-trained Language Models Contain Human-like Biases Of What Is Right And Wrong To Do Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin A. Rothkopf, Kristian Kersting
- Clip-adapter: Better Vision-language Models With Feature Adapters Peng Gao et al.
- MT6: Multilingual Pretrained Text-to-text Transformer With Translation Pairs Zewen Chi et al.
- One Question Answering Model For Many Languages With Cross-lingual Dense Passage Retrieval Akari Asai, Xinyan Yu, Jungo Kasai, Hannaneh Hajishirzi
- Indicbart: A Pre-trained Model For Indic Natural Language Generation Raj Dabre et al.
- How Many Data Points Is A Prompt Worth? Teven Le Scao, Alexander M. Rush
- Commitbert: Commit Message Generation Using Pre-trained Programming Language Model Tae-hwan Jung
- Scalable And Efficient Moe Training For Multitask Multilingual Models Young Jin Kim et al.
- Embodied BERT: A Transformer Model For Embodied, Language-guided Visual Task Completion Alessandro Suglia, Qiaozi Gao, Jesse Thomason, Govind Thattai, Gaurav Sukhatme
- Visqa: X-raying Vision And Language Reasoning In Transformers Theo Jaunet et al.
- Distilling Large Language Models Into Tiny And Effective Students Using Pqrnn Prabhu Kaliamoorthi, Aditya Siddhant, Edward Li, Melvin Johnson
- FLEX: Unifying Evaluation For Few-shot NLP Jonathan Bragg, Arman Cohan, Kyle Lo, Iz Beltagy
- CANINE: Pre-training An Efficient Tokenization-free Encoder For Language Representation Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting
- Unlocking Compositional Generalization In Pre-trained Models Using Intermediate Representations Jonathan Herzig et al.
- Open Domain Question Answering Over Tables Via Dense Retrieval Jonathan Herzig, Thomas Müller, Syrine Krichene, Julian Martin Eisenschlos
- A Simple Recipe For Multilingual Grammatical Error Correction Sascha Rothe, Jonathan Mallinson, Eric Malmi, Sebastian Krause, Aliaksei Severyn
- OPT: Omni-perception Pre-trainer For Cross-modal Understanding And Generation Jing Liu et al.
- Using Adversarial Attacks To Reveal The Statistical Bias In Machine Reading Comprehension Models Jieyu Lin, Jiajie Zou, Nai Ding
- Learned Token Pruning For Transformers Sehoon Kim et al.
- Multi-modal Understanding And Generation For Medical Images And Text Via Vision-language Pre-training Jong Hak Moon, Hyungyung Lee, Woncheol Shin, Young-hak Kim, Edward Choi
- Rethink Training Of BERT Rerankers In Multi-stage Retrieval Pipeline Luyu Gao, Zhuyun Dai, Jamie Callan
- Tip-adapter: Training-free Clip-adapter For Better Vision-language Modeling Renrui Zhang et al.
- Taming Sparsely Activated Transformer With Stochastic Experts Simiao Zuo et al.
- Deltalm: Encoder-decoder Pre-training For Language Generation And Translation By Augmenting Pretrained Multilingual Encoders Shuming Ma et al.
- UFO: A Unified Transformer For Vision-language Representation Learning Jianfeng Wang et al.
- Metaicl: Learning To Learn In Context Sewon Min, Mike Lewis, Luke Zettlemoyer, Hannaneh Hajishirzi
- Long Text Generation By Modeling Sentence-level And Discourse-level Coherence Jian Guan et al.
- Condenser: A Pre-training Architecture For Dense Retrieval Luyu Gao, Jamie Callan
- Fastmoe: A Fast Mixture-of-expert Training System Jiaao He et al.
- Hiddencut: Simple Data Augmentation For Natural Language Understanding With Better Generalization Jiaao Chen, Dinghan Shen, Weizhu Chen, Diyi Yang
- Planning With Learned Entity Prompts For Abstractive Summarization Shashi Narayan et al.
- Pangu-α: Large-scale Autoregressive Pretrained Chinese Language Models With Auto-parallel Computation Wei Zeng et al.
- Lightningdot: Pre-training Visual-semantic Embeddings For Real-time Image-text Retrieval Siqi Sun et al.
- Unified Pre-training For Program Understanding And Generation Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-wei Chang
- Recursively Summarizing Books With Human Feedback Jeff Wu et al.
- Compacter: Efficient Low-rank Hypercomplex Adapter Layers Rabeeh Karimi Mahabadi, James Henderson, Sebastian Ruder
- GALAXY: A Generative Pre-trained Model For Task-oriented Dialog With Semi-supervised Learning And Explicit Policy Injection Wanwei He et al.
- Variational Information Bottleneck For Effective Low-resource Fine-tuning Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- Scaling Language Models: Methods, Analysis & Insights From Training Gopher Jack W. Rae et al.
- Program Synthesis With Large Language Models Jacob Austin et al.
- Diagnosing Vision-and-language Navigation: What Really Matters Wanrong Zhu et al.
- Longt5: Efficient Text-to-text Transformer For Long Sequences Mandy Guo et al.
- How Much Do Language Models Copy From Their Training Data? Evaluating Linguistic Novelty In Text Generation Using RAVEN R. Thomas Mccoy, Paul Smolensky, Tal Linzen, Jianfeng Gao, Asli Celikyilmaz
- Webgpt: Browser-assisted Question-answering With Human Feedback Reiichiro Nakano et al.
- Redditbias: A Real-world Resource For Bias Evaluation And Debiasing Of Conversational Language Models Soumya Barikeri, Anne Lauscher, Ivan Vulić, Goran Glavaš
- Gpt3mix: Leveraging Large-scale Language Models For Text Augmentation Kang Min Yoo, Dongju Park, Jaewook Kang, Sang-woo Lee, Woomyeong Park
- Hurdles To Progress In Long-form Question Answering Kalpesh Krishna, Aurko Roy, Mohit Iyyer
- WARP: Word-level Adversarial Reprogramming Karen Hambardzumyan, Hrant Khachatrian, Jonathan May
- Learning To Prompt For Vision-language Models Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu
- Adapting Language Models For Zero-shot Learning By Meta-tuning On Dataset And Prompt Collections Ruiqi Zhong, Kristy Lee, Zheng Zhang, Dan Klein
- Augmenting Sequential Recommendation With Pseudo-prior Items Via Reversely Pre-training Transformer Zhiwei Liu, Ziwei Fan, Yu Wang, Philip S. Yu
- Pretrained Language Models For Text Generation: A Survey Junyi Li, Tianyi Tang, Wayne Xin Zhao, Ji-rong Wen
- Improving Language Models By Retrieving From Trillions Of Tokens Sebastian Borgeaud et al.
- Indonlg: Benchmark And Resources For Evaluating Indonesian Natural Language Generation Samuel Cahyawijaya et al.
- Byt5: Towards A Token-free Future With Pre-trained Byte-to-byte Models Linting Xue et al.
- Visualgpt: Data-efficient Adaptation Of Pretrained Language Models For Image Captioning Jun Chen, Han Guo, Kai Yi, Boyang Li, Mohamed Elhoseiny
- Robertuito: A Pre-trained Language Model For Social Media Text In Spanish Juan Manuel Pérez, Damián A. Furman, Laura Alonso Alemany, Franco Luque
- Normformer: Improved Transformer Pretraining With Extra Normalization Sam Shleifer, Jason Weston, Myle Ott
- AMMUS : A Survey Of Transformer-based Pretrained Models In Natural Language Processing Katikapalli Subramanyam Kalyan, Ajit Rajasekharan, Sivanesan Sangeetha
- On The Effectiveness Of Adapter-based Tuning For Pretrained Language Model Adaptation Ruidan He et al.
- Training Verifiers To Solve Math Word Problems Karl Cobbe et al.
- Raise A Child In Large Language Model: Towards Effective And Generalizable Fine-tuning Runxin Xu et al.
- Diverse Demonstrations Improve In-context Compositional Generalization Itay Levy, Ben Bogin, Jonathan Berant
- Retromae: Pre-training Retrieval-oriented Language Models Via Masked Auto-encoder Shitao Xiao, Zheng Liu, Yingxia Shao, Zhao Cao
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- Explanations From Large Language Models Make Small Reasoners Better Shiyang Li et al.
- Reacc: A Retrieval-augmented Code Completion Framework Shuai Lu et al.
- Coderl: Mastering Code Generation Through Pretrained Models And Deep Reinforcement Learning Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Silvio Savarese, Steven C. H. Hoi
- Recommendation As Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5) Shijie Geng, Shuchang Liu, Zuohui Fu, Yingqiang Ge, Yongfeng Zhang
- Perturbation Augmentation For Fairer NLP Rebecca Qian et al.
- Vision-and-language Pretrained Models: A Survey Siqu Long, Feiqi Cao, Soyeon Caren Han, Haiqin Yang
- Robotic Skill Acquisition Via Instruction Augmentation With Vision-language Models Ted Xiao et al.
- On The Effect Of Pretraining Corpora On In-context Learning By A Large-scale Language Model Seongjin Shin et al.
- CLIP-TD: CLIP Targeted Distillation For Vision-language Tasks Zhecan Wang et al.
- Biobart: Pretraining And Evaluation Of A Biomedical Generative Language Model Hongyi Yuan et al.
- One Embedder, Any Task: Instruction-finetuned Text Embeddings Hongjin Su et al.
- Interactive And Visual Prompt Engineering For Ad-hoc Task Adaptation With Large Language Models Hendrik Strobelt et al.
- Interleaving Retrieval With Chain-of-thought Reasoning For Knowledge-intensive Multi-step Questions Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal
- Repair Is Nearly Generation: Multilingual Program Repair With Llms Harshit Joshi et al.
- Inner Monologue: Embodied Reasoning Through Planning With Language Models Wenlong Huang et al.
- What Do Llms Know About Financial Markets? A Case Study On Reddit Market Sentiment Analysis Xiang Deng, Vasilisa Bashlovkina, Feng Han, Simon Baumgartner, Michael Bendersky
- Enabling Multimodal Generation On CLIP Via Vision-language Knowledge Distillation Wenliang Dai et al.
- Language Models As Zero-shot Planners: Extracting Actionable Knowledge For Embodied Agents Wenlong Huang, Pieter Abbeel, Deepak Pathak, Igor Mordatch
- EVA2.0: Investigating Open-domain Chinese Dialogue Systems With Large-scale Pre-training Yuxian Gu et al.
- Few-shot Parameter-efficient Fine-tuning Is Better And Cheaper Than In-context Learning Haokun Liu et al.
- Cogvideo: Large-scale Pretraining For Text-to-video Generation Via Transformers Wenyi Hong, Ming Ding, Wendi Zheng, Xinghan Liu, Jie Tang
- Prompt Tuning For Generative Multimodal Pretrained Models Hao Yang et al.
- An Efficient Memory-augmented Transformer For Knowledge-intensive NLP Tasks Yuxiang Wu et al.
- A Length-extrapolatable Transformer Yutao Sun et al.
- Vl-beit: Generative Vision-language Pretraining Hangbo Bao, Wenhui Wang, Li Dong, Furu Wei
- Rethinking With Retrieval: Faithful Large Language Model Inference Hangfeng He, Hongming Zhang, Dan Roth
- Decoupling Knowledge From Memorization: Retrieval-augmented Prompt Learning Xiang Chen et al.
- Pali: A Jointly-scaled Multilingual Language-image Model Xi Chen et al.
- Revisiting Parameter-efficient Tuning: Are We Really There Yet? Guanzheng Chen, Fangyu Liu, Zaiqiao Meng, Shangsong Liang
- Smoothquant: Accurate And Efficient Post-training Quantization For Large Language Models Guangxuan Xiao et al.
- Layoutlmv3: Pre-training For Document AI With Unified Text And Image Masking Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei
- LUT-GEMM: Quantized Matrix Multiplication Based On Luts For Efficient Inference In Large-scale Generative Language Models Gunho Park et al.
- Diffusion-lm Improves Controllable Text Generation Xiang Lisa Li, John Thickstun, Ishaan Gulrajani, Percy Liang, Tatsunori B. Hashimoto
- Atlas: Few-shot Learning With Retrieval Augmented Language Models Gautier Izacard et al.
- Prototypical Verbalizer For Prompt-based Few-shot Tuning Ganqu Cui, Shengding Hu, Ning Ding, Longtao Huang, Zhiyuan Liu
- Data Augmentation For Intent Classification With Off-the-shelf Large Language Models Gaurav Sahu et al.
- Contrastive Decoding: Open-ended Text Generation As Optimization Xiang Lisa Li et al.
- Synchromesh: Reliable Code Generation From Pre-trained Language Models Gabriel Poesia et al.
- On The Transferability Of Pre-trained Language Models For Low-resource Programming Languages Fuxiang Chen, Fatemeh Fard, David Lo, Timofey Bryksin
- Zeroquant: Efficient And Affordable Post-training Quantization For Large-scale Transformers Zhewei Yao et al.
- Flashattention: Fast And Memory-efficient Exact Attention With Io-awareness Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré
- Healthprompt: A Zero-shot Learning Paradigm For Clinical Natural Language Processing Sonish Sivarajkumar, Yanshan Wang
- Speechprompt: An Exploration Of Prompt Tuning On Generative Spoken Language Model For Speech Processing Tasks Kai-wei Chang, Wei-cheng Tseng, Shang-wen Li, Hung-yi Lee
- Alexatm 20B: Few-shot Learning Using A Large-scale Multilingual Seq2seq Model Saleh Soltan et al.
- BLIP: Bootstrapping Language-image Pre-training For Unified Vision-language Understanding And Generation Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi
- Accelerating Attention Through Gradient-based Learned Runtime Pruning Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, Hadi Esmaeilzadeh, Mingu Kang
- Deepspeed-moe: Advancing Mixture-of-experts Inference And Training To Power Next-generation AI Scale Samyam Rajbhandari et al.
- Training Compute-optimal Large Language Models Jordan Hoffmann et al.
- Do Language Models Plagiarize? Jooyoung Lee, Thai Le, Jinghui Chen, Dongwon Lee
- OPT-IML: Scaling Language Model Instruction Meta Learning Through The Lens Of Generalization Srinivasan Iyer et al.
- Cramming: Training A Language Model On A Single GPU In One Day Jonas Geiping, Tom Goldstein
- Towards Trustworthy Autograding Of Short, Multi-lingual, Multi-type Answers Johannes Schneider, Robin Richner, Micha Riser
- Evolution Through Large Models Joel Lehman et al.
- Coditt5: Pretraining For Source Code And Natural Language Editing Jiyang Zhang, Sheena Panthaplackel, Pengyu Nie, Junyi Jessy Li, Milos Gligoric
- Vision-language Pre-training With Triple Contrastive Learning Jinyu Yang et al.
- News Summarization And Evaluation In The Era Of GPT-3 Tanya Goyal, Junyi Jessy Li, Greg Durrett
- Language Models (mostly) Know What They Know Saurav Kadavath et al.
- Generating Sequences By Learning To Self-correct Sean Welleck et al.
- Controllable Natural Language Generation With Contrastive Prefixes Jing Qian, Li Dong, Yelong Shen, Furu Wei, Weizhu Chen
- Incorporating Domain Knowledge Through Task Augmentation For Front-end Javascript Code Generation Sijie Shen et al.
- Unified-io: A Unified Model For Vision, Language, And Multi-modal Tasks Jiasen Lu, Christopher Clark, Rowan Zellers, Roozbeh Mottaghi, Aniruddha Kembhavi
- Lilt: A Simple Yet Effective Language-independent Layout Transformer For Structured Document Understanding Jiapeng Wang, Lianwen Jin, Kai Ding
- GIT: A Generative Image-to-text Transformer For Vision And Language Jianfeng Wang et al.
- Knowledge Prompting In Pre-trained Language Model For Natural Language Understanding Jianing Wang et al.
- Ask Me Anything: A Simple Strategy For Prompting Language Models Simran Arora et al.
- Improving The Domain Adaptation Of Retrieval Augmented Generation (RAG) Models For Open Domain Question Answering Shamane Siriwardhana et al.
- Coca: Contrastive Captioners Are Image-text Foundation Models Jiahui Yu et al.
- Large Language Models Can Self-improve Jiaxin Huang et al.
- Adapting Pre-trained Language Models To African Languages Via Multilingual Adaptive Fine-tuning Jesujoba O. Alabi, David Ifeoluwa Adelani, Marius Mosbach, Dietrich Klakow
- Revisiting The "video" In Video-language Understanding Shyamal Buch et al.
- SPACE-3: Unified Dialog Model Pre-training For Task-oriented Dialog Understanding And Generation Wanwei He et al.
- BERTIN: Efficient Pre-training Of A Spanish Language Model Using Perplexity Sampling Javier De La Rosa et al.
- Benchmarking Large Language Models For Automated Verilog RTL Code Generation Shailja Thakur et al.
- Language Models As Agent Models Jacob Andreas
- Gpt-neox-20b: An Open-source Autoregressive Language Model Sid Black et al.
- Using Deepspeed And Megatron To Train Megatron-turing NLG 530B, A Large-scale Generative Language Model Shaden Smith et al.
- Neural Theory-of-mind? On The Limits Of Social Intelligence In Large Lms Maarten Sap, Ronan Lebras, Daniel Fried, Yejin Choi
- RARR: Researching And Revising What Language Models Say, Using Language Models Luyu Gao et al.
- Inpars: Data Augmentation For Information Retrieval Using Large Language Models Luiz Bonifacio, Hugo Abonizio, Marzieh Fadaee, Rodrigo Nogueira
- What Language Model Architecture And Pretraining Objective Work Best For Zero-shot Generalization? Thomas Wang et al.
- Training Language Models To Follow Instructions With Human Feedback Long Ouyang et al.
- Vit5: Pretrained Text-to-text Transformer For Vietnamese Language Generation Long Phan, Hieu Tran, Hieu Nguyen, Trieu H. Trinh
- Instructionner: A Multi-task Instruction-based Generative Framework For Few-shot NER Liwen Wang et al.
- Structured Like A Language Model: Analysing AI As An Automated Subject Liam Magee, Vanicka Arora, Luke Munn
- Real Or Fake Text?: Investigating Human Ability To Detect Boundaries Between Human-written And Machine-generated Text Liam Dugan, Daphne Ippolito, Arun Kirubarajan, Sherry Shi, Chris Callison-burch
- Lamda: Language Models For Dialog Applications Romal Thoppilan et al.
- Efficient Few-shot Learning Without Prompts Lewis Tunstall et al.
- Phenaki: Variable Length Video Generation From Open Domain Textual Description Ruben Villegas et al.
- Personalized Prompt Learning For Explainable Recommendation Lei Li, Yongfeng Zhang, Li Chen
- Data Distributional Properties Drive Emergent In-context Learning In Transformers Stephanie C. Y. Chan et al.
- The Goldilocks Of Pragmatic Understanding: Fine-tuning Strategy Matters For Implicature Resolution By Llms Laura Ruis et al.
- Exploring The Universal Vulnerability Of Prompt-based Learning Paradigm Lei Xu, Yangyi Chen, Ganqu Cui, Hongcheng Gao, Zhiyuan Liu
- Memorization Without Overfitting: Analyzing The Training Dynamics Of Large Language Models Kushal Tirumala, Aram H. Markosyan, Luke Zettlemoyer, Armen Aghajanyan
- Blenderbot 3: A Deployed Conversational Agent That Continually Learns To Responsibly Engage Kurt Shuster et al.
- Promptagator: Few-shot Dense Retrieval From 8 Examples Zhuyun Dai et al.
- Is Reinforcement Learning (not) For Natural Language Processing: Benchmarks, Baselines, And Building Blocks For Natural Language Policy Optimization Rajkumar Ramamurthy et al.
- Visual Programming: Compositional Visual Reasoning Without Training Tanmay Gupta, Aniruddha Kembhavi
- Efficient Long-text Understanding With Short-text Models Maor Ivgi, Uri Shaham, Jonathan Berant
- Reproducible Scaling Laws For Contrastive Language-image Learning Mehdi Cherti et al.
- Visual Prompt Tuning Menglin Jia et al.
- When And Why Vision-language Models Behave Like Bags-of-words, And What To Do About It? Mert Yuksekgonul, Federico Bianchi, Pratyusha Kalluri, Dan Jurafsky, James Zou
- Language Models With Image Descriptors Are Strong Few-shot Video-language Learners Zhenhailong Wang et al.
- Gpt-3-driven Pedagogical Agents For Training Children's Curious Question-asking Skills Rania Abdelghani et al.
- Mixgen: A New Multi-modal Data Augmentation Xiaoshuai Hao et al.
- Pangu-coder: Program Synthesis With Function-level Language Modeling Fenia Christopoulou et al.
- Vindlu: A Recipe For Effective Video-and-language Pretraining Feng Cheng et al.
- Vision-language Intelligence: Tasks, Representation Learning, And Large Models Feng Li et al.
- SKILL: Structured Knowledge Infusion For Large Language Models Fedor Moiseev, Zhe Dong, Enrique Alfonseca, Martin Jaggi
- NLX-GPT: A Model For Natural Language Explanations In Vision And Vision-language Tasks Fawaz Sammani, Tanmoy Mukherjee, Nikos Deligiannis
- Deplot: One-shot Visual Language Reasoning By Plot-to-table Translation Fangyu Liu et al.
- Sgva-clip: Semantic-guided Visual Adapting Of Vision-language Models For Few-shot Image Classification Fang Peng, Xiaoshan Yang, Linhui Xiao, Yaowei Wang, Changsheng Xu
- Legal Prompting: Teaching A Language Model To Think Like A Lawyer Fangyi Yu, Lee Quartey, Frank Schilder
- Long Time No See! Open-domain Conversation With Long-term Persona Memory Xinchao Xu et al.
- Red Teaming Language Models With Language Models Ethan Perez et al.
- Matcha: Enhancing Visual Language Pretraining With Math Reasoning And Chart Derendering Fangyu Liu et al.
- Codegen: An Open Large Language Model For Code With Multi-turn Program Synthesis Erik Nijkamp et al.
- Capturing Failures Of Large Language Models Via Human Cognitive Biases Erik Jones, Jacob Steinhardt
- Prompt Distribution Learning Yuning Lu, Jianzhuang Liu, Yonggang Zhang, Yajing Liu, Xinmei Tian
- Star: Bootstrapping Reasoning With Reasoning Eric Zelikman, Yuhuai Wu, Jesse Mu, Noah D. Goodman
- Vl-interpret: An Interactive Visualization Tool For Interpreting Vision-language Transformers Estelle Aflalo et al.
- Compilable Neural Code Generation With Compiler Feedback Xin Wang et al.
- IGLUE: A Benchmark For Transfer Learning Across Modalities, Tasks, And Languages Emanuele Bugliarello et al.
- GPTQ: Accurate Post-training Quantization For Generative Pre-trained Transformers Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh
- The Optimal BERT Surgeon: Scalable And Accurate Second-order Pruning For Large Language Models Eldar Kurtic et al.
- CREPE: Can Vision-language Foundation Models Reason Compositionally? Zixian Ma et al.
- Quark: Controllable Text Generation With Reinforced Unlearning Ximing Lu et al.
- A Generative Language Model For Few-shot Aspect-based Sentiment Analysis Ehsan Hosseini-asl, Wenhao Liu, Caiming Xiong
- LAVIS: A Library For Language-vision Intelligence Dongxu Li et al.
- Legal Prompt Engineering For Multilingual Legal Judgement Prediction Dietrich Trautmann, Alina Petrova, Frank Schilder
- Altclip: Altering The Language Encoder In CLIP For Extended Language Capabilities Zhongzhi Chen et al.
- Improving Passage Retrieval With Zero-shot Question Generation Devendra Singh Sachan et al.
- Lm-nav: Robotic Navigation With Large Pre-trained Models Of Language, Vision, And Action Dhruv Shah, Blazej Osinski, Brian Ichter, Sergey Levine
- VIMA: General Robot Manipulation With Multimodal Prompts Yunfan Jiang et al.
- Visual-language Navigation Pretraining Via Prompt-based Environmental Self-exploration Xiwen Liang, Fengda Zhu, Lingling Li, Hang Xu, Xiaodan Liang
- Binding Language Models In Symbolic Languages Zhoujun Cheng et al.
- The Stack: 3 TB Of Permissively Licensed Source Code Denis Kocetkov et al.
- Least-to-most Prompting Enables Complex Reasoning In Large Language Models Denny Zhou et al.
- Learning Vector-quantized Item Representation For Transferable Sequential Recommenders Yupeng Hou, Zhankui He, Julian Mcauley, Wayne Xin Zhao
- Protoclip: Prototypical Contrastive Language Image Pretraining Delong Chen et al.
- Block-recurrent Transformers Delesley Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur
- Factpegasus: Factuality-aware Pre-training And Fine-tuning For Abstractive Summarization David Wan, Mohit Bansal
- Neural Pipeline For Zero-shot Data-to-text Generation Zdeněk Kasner, Ondřej Dušek
- Adaprompt: Adaptive Model Training For Prompt-based NLP Yulong Chen et al.
- Large Language Models Meet Nl2code: A Survey Daoguang Zan et al.
- CERT: Continual Pre-training On Sketches For Library-oriented Code Generation Daoguang Zan et al.
- Scaling Laws And Interpretability Of Learning From Repeated Data Danny Hernandez et al.
- Hungry Hungry Hippos: Towards Language Modeling With State Space Models Daniel Y. Fu et al.
- Why Can GPT Learn In-context? Language Models Implicitly Perform Gradient Descent As Meta-optimizers Damai Dai et al.
- Democratizing Contrastive Language-image Pre-training: A CLIP Benchmark Of Data, Model, And Supervision Yufeng Cui, Lichen Zhao, Feng Liang, Yangguang Li, Jing Shao
- Discovering Latent Knowledge In Language Models Without Supervision Collin Burns, Haotian Ye, Dan Klein, Jacob Steinhardt
- Learning Video Representations From Large Language Models Yue Zhao, Ishan Misra, Philipp Krähenbühl, Rohit Girdhar
- Augesc: Dialogue Augmentation With Large Language Models For Emotional Support Conversation Chujie Zheng, Sahand Sabour, Jiaxin Wen, Zheng Zhang, Minlie Huang
- Competition-level Code Generation With Alphacode Yujia Li et al.
- LAION-5B: An Open Large-scale Dataset For Training Next Generation Image-text Models Christoph Schuhmann et al.
- Promda: Prompt-based Data Augmentation For Low-resource NLU Tasks Yufei Wang et al.
- Prompt For Extraction? PAIE: Prompting Argument Interaction For Event Argument Extraction Yubo Ma et al.
- Fast Inference From Transformers Via Speculative Decoding Yaniv Leviathan, Matan Kalman, Yossi Matias
- Cont: Contrastive Neural Text Generation Chenxin An et al.
- Linearly Mapping From Image To Text Space Jack Merullo, Louis Castricato, Carsten Eickhoff, Ellie Pavlick
- Noisytune: A Little Noise Can Help You Finetune Pretrained Language Models Better Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang, Xing Xie
- A Unified End-to-end Retriever-reader Framework For Knowledge-based VQA Yangyang Guo et al.
- Code4struct: Code Generation For Few-shot Event Structure Prediction Xingyao Wang, Sha Li, Heng Ji
- Scaling Language-image Pre-training Via Masking Yanghao Li, Haoqi Fan, Ronghang Hu, Christoph Feichtenhofer, Kaiming He
- Texts As Images In Prompt Tuning For Multi-label Image Recognition Zixian Guo et al.
- No More Fine-tuning? An Experimental Evaluation Of Prompt Tuning In Code Intelligence Chaozheng Wang et al.
- Long-form Video-language Pre-training With Multimodal Temporal Contrastive Learning Yuchong Sun et al.
- Calibrating Sequence Likelihood Improves Conditional Language Generation Yao Zhao et al.
- In-context Learning And Induction Heads Catherine Olsson et al.
- Llm-planner: Few-shot Grounded Planning For Embodied Agents With Large Language Models Chan Hee Song et al.
- A Survey On Model Compression And Acceleration For Pretrained Language Models Canwen Xu, Julian Mcauley
- Adamix: Mixture-of-adaptations For Parameter-efficient Model Tuning Yaqing Wang et al.
- IDPG: An Instance-dependent Prompt Generation Method Zhuofeng Wu et al.
- Optimizing Prompts For Text-to-image Generation Yaru Hao, Zewen Chi, Li Dong, Furu Wei
- Enabling Conversational Interaction With Mobile UI Using Large Language Models Bryan Wang, Gang Li, Yang Li
- Why Does Surprisal From Larger Transformer-based Language Models Provide A Poorer Fit To Human Reading Times? Byung-doh Oh, William Schuler
- Exploring The Limits Of Domain-adaptive Training For Detoxifying Large-scale Language Models Boxin Wang et al.
- Expanding Language-image Pretrained Models For General Video Recognition Bolin Ni et al.
- Multimodal Knowledge Alignment With Reinforcement Learning Youngjae Yu et al.
- LERT: A Linguistically-motivated Pre-trained Language Model Yiming Cui, Wanxiang Che, Shijin Wang, Ting Liu
- Super-naturalinstructions: Generalization Via Declarative Instructions On 1600+ NLP Tasks Yizhong Wang et al.
- Impact Of Pretraining Term Frequencies On Few-shot Reasoning Yasaman Razeghi, Robert L. Iv Logan, Matt Gardner, Sameer Singh
- Audiolm: A Language Modeling Approach To Audio Generation Zalán Borsos et al.
- Language Models Can See: Plugging Visual Controls In Text Generation Yixuan Su et al.
- Thinking About GPT-3 In-context Learning For Biomedical IE? Think Again Bernal Jiménez Gutiérrez et al.
- Prompt-aligned Gradient For Prompt Tuning Beier Zhu, Yulei Niu, Yucheng Han, Yue Wu, Hanwang Zhang
- St-moe: Designing Stable And Transferable Sparse Expert Models Barret Zoph et al.
- GODEL: Large-scale Pre-training For Goal-directed Dialog Baolin Peng et al.
- What Do They Capture? -- A Structural Analysis Of Pre-trained Language Models For Source Code Yao Wan et al.
- Generative Language Models For Paragraph-level Question Generation Asahi Ushio, Fernando Alva-manchego, Jose Camacho-collados
- T-NER: An All-round Python Library For Transformer-based Named Entity Recognition Asahi Ushio, Jose Camacho-collados
- Dialog Inpainting: Turning Documents Into Dialogs Zhuyun Dai et al.
- GLM-130B: An Open Bilingual Pre-trained Model Aohan Zeng et al.
- Zero-shot Video Question Answering Via Frozen Bidirectional Language Models Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- Selection-inference: Exploiting Large Language Models For Interpretable Logical Reasoning Antonia Creswell, Murray Shanahan, Irina Higgins
- UL2: Unifying Language Learning Paradigms Yi Tay et al.
- Plug-and-play VQA: Zero-shot VQA By Conjoining Large Pretrained Models With Zero Training Anthony Meng Huat Tiong, Junnan Li, Boyang Li, Silvio Savarese, Steven C. H. Hoi
- Mslam: Massively Multilingual Joint Pre-training For Speech And Text Ankur Bapna et al.
- Internet-augmented Language Models Through Few-shot Prompting For Open-domain Question Answering Angeliki Lazaridou, Elena Gribovskaya, Wojciech Stokowiec, Nikolai Grigorev
- Compositional Semantic Parsing With Large Language Models Andrew Drozdov et al.
- Generating Training Data With Language Models: Towards Zero-shot Language Understanding Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han
- Language Model Compression With Weighted Low-rank Factorization Yen-chang Hsu et al.
- Contrastive Search Is What You Need For Neural Text Generation Yixuan Su, Nigel Collier
- Memory-assisted Prompt Editing To Improve GPT-3 After Deployment Aman Madaan, Niket Tandon, Peter Clark, Yiming Yang
- Active Example Selection For In-context Learning Yiming Zhang, Shi Feng, Chenhao Tan
- A Model-agnostic Data Manipulation Method For Persona-based Dialogue Generation Yu Cao, Wei Bi, Meng Fang, Shuming Shi, Dacheng Tao
- Standing On The Shoulders Of Giant Frozen Language Models Yoav Levine et al.
- Should You Mask 15% In Masked Language Modeling? Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen
- WANLI: Worker And AI Collaboration For Natural Language Inference Dataset Creation Alisa Liu, Swabha Swayamdipta, Noah A. Smith, Yejin Choi
- Position-guided Text Prompt For Vision-language Pre-training Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan
- REVEAL: Retrieval-augmented Visual-language Pre-training With Multi-source Multimodal Knowledge Memory Ziniu Hu et al.
- Personalized Prompt For Sequential Recommendation Yiqing Wu et al.
- ATTEMPT: Parameter-efficient Multi-task Tuning Via Attentional Mixtures Of Soft Prompts Akari Asai, Mohammadreza Salehi, Matthew E. Peters, Hannaneh Hajishirzi
- A Systematic Review And Replicability Study Of Bert4rec For Sequential Recommendation Aleksandr Petrov, Craig Macdonald
- Prompt Tuning For Discriminative Pre-trained Language Models Yuan Yao et al.
- LASP: Text-to-text Optimization For Language-aware Soft Prompting Of Vision & Language Models Adrian Bulat, Georgios Tzimiropoulos
- A New Path: Scaling Vision-and-language Navigation With Synthetic Instructions And Imitation Learning Aishwarya Kamath et al.
- Scaling Up Models And Data With t5x And seqio Adam Roberts et al.
- TALM: Tool Augmented Language Models Aaron Parisi, Yao Zhao, Noah Fiedel
- Palm: Scaling Language Modeling With Pathways Aakanksha Chowdhery et al.
- PEVL: Position-enhanced Pre-training And Prompt Tuning For Vision-language Models Yuan Yao et al.
- Dynamic Prompt Learning Via Policy Gradient For Semi-structured Mathematical Reasoning Pan Lu et al.
- Unnatural Instructions: Tuning Language Models With (almost) No Human Labor Or Honovich, Thomas Scialom, Omer Levy, Timo Schick
- Large Language Models And The Reverse Turing Test Terrence Sejnowski
- What Matters In Language Conditioned Robotic Imitation Learning Over Unstructured Data Oier Mees, Lukas Hermann, Wolfram Burgard
- Measuring And Narrowing The Compositionality Gap In Language Models Ofir Press et al.
- On The Origin Of Hallucinations In Conversational Models: Is It The Datasets Or The Models? Nouha Dziri, Sivan Milton, Mo Yu, Osmar Zaiane, Siva Reddy
- Grounding Language With Visual Affordances Over Unstructured Data Oier Mees, Jessica Borja-diaz, Wolfram Burgard
- Emergent Analogical Reasoning In Large Language Models Taylor Webb, Keith J. Holyoak, Hongjing Lu
- Faithdial: A Faithful Benchmark For Information-seeking Dialogue Nouha Dziri et al.
- Parallel Context Windows For Large Language Models Nir Ratner et al.
- No Language Left Behind: Scaling Human-centered Machine Translation Nllb Team et al.
- Delta Tuning: A Comprehensive Study Of Parameter Efficient Methods For Pre-trained Language Models Ning Ding et al.
- LIFT: Language-interfaced Fine-tuning For Non-language Machine Learning Tasks Tuan Dinh et al.
- Large Language Models Struggle To Learn Long-tail Knowledge Nikhil Kandpal, Haikang Deng, Adam Roberts, Eric Wallace, Colin Raffel
- Learning To Compose Soft Prompts For Compositional Zero-shot Learning Nihal V. Nayak, Peilin Yu, Stephen H. Bach
- Quantifying Memorization Across Neural Language Models Nicholas Carlini et al.
- SGPT: GPT Sentence Embeddings For Semantic Search Niklas Muennighoff
- Vl-checklist: Evaluating Pre-trained Vision-language Models With Objects, Attributes And Relations Tiancheng Zhao et al.
- Crosslingual Generalization Through Multitask Finetuning Niklas Muennighoff et al.
- Large Language Models Are Reasoning Teachers Namgyu Ho, Laura Schmid, Se-young Yun
- Clinical Prompt Learning With Frozen Language Models Niall Taylor, Yi Zhang, Dan Joyce, Alejo Nevado-holgado, Andrey Kormilitzin
- Fine-tuned Language Models Are Continual Learners Thomas Scialom, Tuhin Chakrabarty, Smaranda Muresan
- Efficient Training Of Language Models To Fill In The Middle Mohammad Bavarian et al.
- Dylora: Parameter Efficient Tuning Of Pre-trained Models Using Dynamic Search-free Low-rank Adaptation Mojtaba Valipour, Mehdi Rezagholizadeh, Ivan Kobyzev, Ali Ghodsi
- KALA: Knowledge-augmented Language Model Adaptation Minki Kang, Jinheon Baek, Sung Ju Hwang
- Rlprompt: Optimizing Discrete Text Prompts With Reinforcement Learning Mingkai Deng et al.
- An Empirical Study Of End-to-end Video-language Transformers With Masked Visual Modeling Tsu-jui Fu et al.
- Deep Bidirectional Language-knowledge Graph Pretraining Michihiro Yasunaga et al.
- Retrieval-augmented Multimodal Language Modeling Michihiro Yasunaga et al.
- Few-shot Training Llms For Project-specific Code-summarization Toufique Ahmed, Premkumar Devanbu
- CLIPPO: Image-and-language Understanding From Pixels Only Michael Tschannen, Basil Mustafa, Neil Houlsby
- Training And Evaluating A Jupyter Notebook Data Science Assistant Shubham Chandel, Colin B. Clement, Guillermo Serrato, Neel Sundaresan
- GPT Takes The Bar Exam Michael Ii Bommarito, Daniel Martin Katz
- Transformer Quality In Linear Time Weizhe Hua, Zihang Dai, Hanxiao Liu, Quoc V. Le
- Make-a-video: Text-to-video Generation Without Text-video Data Uriel Singer et al.
- Help Me Write A Poem: Instruction Tuning As A Vehicle For Collaborative Poetry Writing Tuhin Chakrabarty, Vishakh Padmakumar, He He
- M6-rec: Generative Pretrained Language Models Are Open-ended Recommender Systems Zeyu Cui, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang
- OFA: Unifying Architectures, Tasks, And Modalities Through A Simple Sequence-to-sequence Learning Framework Peng Wang et al.
- PINTO: Faithful Language Reasoning Using Prompt-generated Rationales Peifeng Wang, Aaron Chan, Filip Ilievski, Muhao Chen, Xiang Ren
- Can Llms Express Their Uncertainty? An Empirical Evaluation Of Confidence Elicitation In Llms Miao Xiong et al.
- From Image To Language: A Critical Analysis Of Visual Question Answering (VQA) Approaches, Challenges, And Opportunities Md Farhan Ishmam, Md Sakib Hossain Shovon, M. F. Mridha, Nilanjan Dey
- Drivegpt4: Interpretable End-to-end Autonomous Driving Via Large Language Model Zhenhua Xu et al.
- An Empirical Evaluation Of Using Large Language Models For Automated Unit Test Generation Max Schäfer, Sarah Nadi, Aryaz Eghbali, Frank Tip
- Distilling Large Language Models For Matching Patients To Clinical Trials Mauro Nievas, Aditya Basu, Yanshan Wang, Hrituraj Singh
- Few-shot Fine-tuning Vs. In-context Learning: A Fair Comparison And Evaluation Marius Mosbach, Tiago Pimentel, Shauli Ravfogel, Dietrich Klakow, Yanai Elazar
- LLM Self Defense: By Self Examination, Llms Know They Are Being Tricked Mansi Phute et al.
- Flexkbqa: A Flexible Llm-powered Framework For Few-shot Knowledge Base Question Answering Zhenyu Li et al.
- The Reversal Curse: Llms Trained On "A Is B" Fail To Learn "B Is A" Lukas Berglund et al.
- Chatgpt For Vulnerability Detection, Classification, And Repair: How Far Are We? Michael Fu, Chakkrit Tantithamthavorn, Van Nguyen, Trung Le
- Document-level Machine Translation With Large Language Models Longyue Wang et al.
- Driving With Llms: Fusing Object-level Vector Modality For Explainable Autonomous Driving Long Chen et al.
- Taiyi: A Bilingual Fine-tuned Large Language Model For Diverse Biomedical Tasks Ling Luo et al.
- Parameter-efficient Fine-tuning Methods For Pretrained Language Models: A Critical Review And Assessment Lingling Xu, Haoran Xie, Si-zhao Joe Qin, Xiaohui Tao, Fu Lee Wang
- Scaling Autoregressive Multi-modal Models: Pretraining And Instruction Tuning Lili Yu et al.
- Reasoning On Graphs: Faithful And Interpretable Large Language Model Reasoning Linhao Luo, Yuan-fang Li, Gholamreza Haffari, Shirui Pan
- Improving CLIP Training With Language Rewrites Lijie Fan, Dilip Krishnan, Phillip Isola, Dina Katabi, Yonglong Tian
- Automatically Correcting Large Language Models: Surveying The Landscape Of Diverse Self-correction Strategies Liangming Pan et al.
- A Survey On Large Language Models For Recommendation Likang Wu et al.
- Improving Text Embeddings With Large Language Models Liang Wang et al.
- Zephyr: Direct Distillation Of LM Alignment Lewis Tunstall et al.
- Query2doc: Query Expansion With Large Language Models Liang Wang, Nan Yang, Furu Wei
- ULIP-2: Towards Scalable Multimodal Pre-training For 3D Understanding Le Xue et al.
- Zero-shot Next-item Recommendation Using Large Pretrained Language Models Lei Wang, Ee-peng Lim
- Dissociating Language And Thought In Large Language Models Kyle Mahowald et al.
- Mvbench: A Comprehensive Multi-modal Video Understanding Benchmark Kunchang Li et al.
- Sentimentgpt: Exploiting GPT For Advanced Sentiment Analysis And Its Departure From Current Machine Learning Kiana Kheiri, Hamid Karimi
- Just Tell Me: Prompt Engineering In Business Process Management Kiran Busch, Alexander Rochlitzer, Diana Sola, Henrik Leopold
- Tallrec: An Effective And Efficient Tuning Framework To Align Large Language Model With Recommendation Keqin Bao et al.
- Large Language Models And Simple, Stupid Bugs Kevin Jesse, Toufique Ahmed, Premkumar T. Devanbu, Emily Morgan
- Automating Customer Service Using Langchain: Building Custom Open-source GPT Chatbot For Organizations Keivalya Pandya, Mehfuza Holia
- Speak, Memory: An Archaeology Of Books Known To Chatgpt/gpt-4 Kent K. Chang, Mackenzie Cramer, Sandeep Soni, David Bamman
- Just Ask For Calibration: Strategies For Eliciting Calibrated Confidence Scores From Language Models Fine-tuned With Human Feedback Katherine Tian et al.
- A Survey Of GPT-3 Family Large Language Models Including Chatgpt And GPT-4 Katikapalli Subramanyam Kalyan
- Biomedical Knowledge Graph-optimized Prompt Generation For Large Language Models Karthik Soman et al.
- Tinyclip: CLIP Distillation Via Affinity Mimicking And Weight Inheritance Kan Stephen Wu et al.
- Chipgpt: How Far Are We From Natural Language Hardware Design Kaiyan Chang et al.
- Aligning Instruction Tasks Unlocks Large Language Models As Zero-shot Relation Extractors Kai Zhang, Bernal Jiménez Gutiérrez, Yu Su
- Full Parameter Fine-tuning For Large Language Models With Limited Resources Kai Lv et al.
- ALIP: Adaptive Language-image Pre-training With Synthetic Caption Kaicheng Yang et al.
- The Rise And Potential Of Large Language Model Based Agents: A Survey Zhiheng Xi et al.
- Evaluation And Analysis Of Hallucination In Large Vision-language Models Junyang Wang et al.
- BLIP-2: Bootstrapping Language-image Pre-training With Frozen Image Encoders And Large Language Models Junnan Li, Dongxu Li, Silvio Savarese, Steven Hoi
- A Comprehensive Capability Analysis Of GPT-3 And GPT-3.5 Series Models Junjie Ye et al.
- Llama-reviewer: Advancing Code Review Automation With Large Language Models Through Parameter-efficient Fine-tuning Junyi Lu, Lei Yu, Xiaojia Li, Li Yang, Chun Zuo
- Chatcounselor: A Large Language Model For Mental Health Support June M. Liu et al.
- Backdooring Instruction-tuned Large Language Models With Virtual Prompt Injection Jun Yan et al.
- Breaking The Silence: The Threats Of Using Llms In Software Engineering June Sallou, Thomas Durieux, Annibale Panichella
- Transferable Decoding With Visual Entities For Zero-shot Image Captioning Junjie Fei et al.
- GQA: Training Generalized Multi-query Transformer Models From Multi-head Checkpoints Joshua Ainslie et al.
- Minigpt-v2: Large Language Model As A Unified Interface For Vision-language Multi-task Learning Jun Chen et al.
- Increasing Diversity While Maintaining Accuracy: Text Data Generation With Large Language Models And Human Interventions John Joon Young Chung, Ece Kamar, Saleema Amershi
- Exploring The Benefits Of Training Expert Language Models Over Instruction Tuning Joel Jang et al.
- Missrec: Pre-training And Transferring Multi-modal Interest-aware Sequence Representation For Recommendation Jinpeng Wang et al.
- Bias And Fairness In Chatbots: An Overview Jintang Xue et al.
- Grounding Language Models To Images For Multimodal Inputs And Outputs Jing Yu Koh, Ruslan Salakhutdinov, Daniel Fried
- When Large Language Models Meet Personalization: Perspectives Of Challenges And Opportunities Jin Chen et al.
- Prompt-and-align: Prompt-based Social Alignment For Few-shot Fake News Detection Jiaying Wu, Shen Li, Ailin Deng, Miao Xiong, Bryan Hooi
- Empowering Molecule Discovery For Molecule-caption Translation With Large Language Models: A Chatgpt Perspective Jiatong Li et al.
- Badgpt: Exploring Security Vulnerabilities Of Chatgpt Via Backdoor Attacks To Instructgpt Jiawen Shi, Yixin Liu, Pan Zhou, Lichao Sun
- Unified-io 2: Scaling Autoregressive Multimodal Models With Vision, Language, Audio, And Action Jiasen Lu et al.
- Language Models Meet World Models: Embodied Experiences Enhance Language Models Jiannan Xiang et al.
- Think-on-graph: Deep And Responsible Reasoning Of Large Language Model On Knowledge Graph Jiashuo Sun et al.
- How Can Recommender Systems Benefit From Large Language Models: A Survey Jianghao Lin et al.
- Rella: Retrieval-enhanced Large Language Models For Lifelong Sequential Behavior Comprehension In Recommendation Jianghao Lin et al.
- Imagebind-llm: Multi-modality Instruction Tuning Jiaming Han et al.
- On Decoder-only Architecture For Speech-to-text And Large Language Model Integration Jian Wu et al.
- Llm-grounder: Open-vocabulary 3D Visual Grounding With Large Language Model As An Agent Jianing Yang et al.
- ICL-D3IE: In-context Learning With Diverse Demonstrations Updating For Document Information Extraction Jiabang He et al.
- Graphgpt: Graph Instruction Tuning For Large Language Models Jiabin Tang et al.
- Unlearn What You Want To Forget: Efficient Unlearning For Llms Jiaao Chen, Diyi Yang
- Ureader: Universal Ocr-free Visually-situated Language Understanding With Multimodal Large Language Model Jiabo Ye et al.
- VILA: On Pre-training For Visual Language Models Ji Lin et al.
- Learning To Compress Prompts With Gist Tokens Jesse Mu, Xiang Lisa Li, Noah Goodman
- Larger Language Models Do In-context Learning Differently Jerry Wei et al.
- Memory-efficient Fine-tuning Of Compressed Large Language Models Via Sub-4-bit Integer Quantization Jeonghoon Kim et al.
- Physically Grounded Vision-language Models For Robotic Manipulation Jensen Gao et al.
- Auditing Large Language Models: A Three-layered Approach Jakob Mökander, Jonas Schuett, Hannah Rose Kirk, Luciano Floridi
- Fake News In Sheep's Clothing: Robust Fake News Detection Against Llm-empowered Style Attacks Jiaying Wu, Jiafeng Guo, Bryan Hooi
- Evaluation Of Chatgpt On Biomedical Tasks: A Zero-shot Comparison With Fine-tuned Generative Transformers Israt Jahan, Md Tahmid Rahman Laskar, Chun Peng, Jimmy Huang
- A Comprehensive Evaluation Of Large Language Models On Benchmark Biomedical Text Processing Tasks Israt Jahan, Md Tahmid Rahman Laskar, Chun Peng, Jimmy Huang
- The Curse Of Recursion: Training On Generated Data Makes Models Forget Ilia Shumailov et al.
- Principle-driven Self-alignment Of Language Models From Scratch With Minimal Human Supervision Zhiqing Sun et al.
- A Comprehensive Overview Of Large Language Models Humza Naveed et al.
- The Bigscience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset Hugo Laurençon et al.
- Llama 2: Open Foundation And Fine-tuned Chat Models Hugo Touvron et al.
- Ip-adapter: Text Compatible Image Prompt Adapter For Text-to-image Diffusion Models Hu Ye, Jun Zhang, Sibo Liu, Xiao Han, Wei Yang
- Doctorglm: Fine-tuning Your Chinese Doctor Is Not A Herculean Task Honglin Xiong et al.
- Bioinstruct: Instruction Tuning Of Large Language Models For Biomedical Natural Language Processing Hieu Tran, Zhichao Yang, Zonghai Yao, Hong Yu
- Large Language Models Can Infer Psychological Dispositions Of Social Media Users Heinrich Peters, Sandra Matz
- Extending Context Window Of Large Language Models Via Positional Interpolation Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian
- Self-chained Image-language Model For Video Localization And Question Answering Shoubin Yu, Jaemin Cho, Prateek Yadav, Mohit Bansal
- Unifying Large Language Models And Knowledge Graphs: A Roadmap Shirui Pan et al.
- VIP5: Towards Multimodal Foundation Models For Recommendation Shijie Geng, Juntao Tan, Shuchang Liu, Zuohui Fu, Yongfeng Zhang
- Large Language Model Augmented Narrative Driven Recommendations Sheshera Mysore, Andrew Mccallum, Hamed Zamani
- Instruction Tuning For Large Language Models: A Survey Shengyu Zhang et al.
- Mixture-of-experts Meets Instruction Tuning:a Winning Combination For Large Language Models Sheng Shen et al.
- Scaling Vision-language Models With Sparse Mixture Of Experts Sheng Shen et al.
- The Flan Collection: Designing Data And Methods For Effective Instruction Tuning Shayne Longpre et al.
- Evaluation Of Chatgpt Family Of Models For Biomedical Reasoning And Classification Shan Chen et al.
- Sur-adapter: Enhancing Text-to-image Pre-trained Diffusion Models With Large Language Models Shanshan Zhong, Zhongzhan Huang, Wushao Wen, Jinghui Qin, Liang Lin
- The Cot Collection: Improving Zero-shot And Few-shot Learning Of Language Models Via Chain-of-thought Fine-tuning Seungone Kim et al.
- The Moral Authority Of Chatgpt Sebastian Krügel, Andreas Ostermaier, Matthias Uhl
- Next-gpt: Any-to-any Multimodal LLM Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, Tat-seng Chua
- A Comparative Study Of Open-source Large Language Models, GPT-4 And Claude 2: Multiple-choice Test Taking In Nephrology Sean Wu et al.
- Large Language Models Are Competitive Near Cold-start Recommenders For Language- And Item-based Preferences Scott Sanner, Krisztian Balog, Filip Radlinski, Ben Wedin, Lucas Dixon
- Fine-tuning Language Models With Just Forward Passes Sadhika Malladi et al.
- Does Synthetic Data Generation Of Llms Help Clinical Text Mining? Ruixiang Tang, Xiaotian Han, Xiaoqian Jiang, Xia Hu
- Gpt4tools: Teaching Large Language Model To Use Tools Via Self-instruction Rui Yang et al.
- Secrets Of RLHF In Large Language Models Part I: PPO Rui Zheng et al.
- Tinystories: How Small Can Language Models Be And Still Speak Coherent English? Ronen Eldan, Yuanzhi Li
- In-context Learning Creates Task Vectors Roee Hendel, Mor Geva, Amir Globerson
- Retrieval-augmented Image Captioning Rita Ramos, Desmond Elliott, Bruno Martins
- Beyond Memorization: Violating Privacy Via Inference With Large Language Models Robin Staab, Mark Vero, Mislav Balunović, Martin Vechev
- Prompt, Generate, Then Cache: Cascade Of Foundation Models Makes Strong Few-shot Learners Renrui Zhang et al.
- Scaling Laws For Language Encoding Models In Fmri Richard Antonello, Aditya Vaidya, Alexander G. Huth
- Chatgpt Versus Traditional Question Answering For Knowledge Graphs: Current Status And Future Directions Towards Knowledge Graph Chatbots Reham Omar, Omij Mangukiya, Panos Kalnis, Essam Mansour
- A Universal Question-answering Platform For Knowledge Graphs Reham Omar, Ishika Dhall, Panos Kalnis, Essam Mansour
- Automatic Prompt Optimization With "gradient Descent" And Beam Search Reid Pryzant et al.
- Llama-adapter: Efficient Fine-tuning Of Language Models With Zero-init Attention Renrui Zhang et al.
- Pro-cap: Leveraging A Frozen Vision-language Model For Hateful Meme Detection Rui Cao et al.
- Scalable Educational Question Generation With Pre-trained Language Models Sahan Bulathwela, Hamze Muse, Emine Yilmaz
- Sabiá: Portuguese Large Language Models Ramon Pires, Hugo Abonizio, Thales Sales Almeida, Rodrigo Nogueira
- Lawyer Llama Technical Report Quzhe Huang et al.
- Direct Preference Optimization: Your Language Model Is Secretly A Reward Model Rafael Rafailov et al.
- Mplug-owl: Modularization Empowers Large Language Models With Multimodality Qinghao Ye et al.
- Adalora: Adaptive Budget Allocation For Parameter-efficient Fine-tuning Qingru Zhang et al.
- ONCE: Boosting Content-based Recommendation With Both Open- And Closed-source Large Language Models Qijiong Liu, Nuo Chen, Tetsuya Sakai, Xiao-ming Wu
- Pushing Large Language Models To The 6G Edge: Vision, Challenges, And Opportunities Zheng Lin et al.
- Medcpt: Contrastive Pre-trained Transformers With Large-scale Pubmed Search Logs For Zero-shot Biomedical Information Retrieval Qiao Jin et al.
- Masked Vision And Language Pre-training With Unimodal And Multimodal Contrastive Losses For Medical Visual Question Answering Pengfei Li, Gang Liu, Jinlong He, Zixu Zhao, Shenjun Zhong
- Llama-adapter V2: Parameter-efficient Visual Instruction Model Peng Gao et al.
- Pre-train, Prompt And Recommendation: A Comprehensive Survey Of Language Modelling Paradigm Adaptations In Recommender Systems Peng Liu, Lemei Zhang, Jon Atle Gulla
- Audiopalm: A Large Language Model That Can Speak And Listen Paul K. Rubenstein et al.
- Internlm-xcomposer: A Vision-language Large Model For Advanced Text-image Comprehension And Composition Pan Zhang et al.
- In-context Retrieval-augmented Language Models Ori Ram et al.
- GPT-4 Technical Report Openai et al.
- Fine-tuning Or Retrieval? Comparing Knowledge Injection In Llms Oded Ovadia, Menachem Brief, Moshik Mishaeli, Oren Elisha
- Fusecap: Leveraging Large Language Models For Enriched Fused Image Captions Noam Rotstein, David Bensaid, Shaked Brody, Roy Ganz, Ron Kimmel
- Large Language Models Are Built-in Autoregressive Search Engines Noah Ziems, Wenhao Yu, Zhihan Zhang, Meng Jiang
- Reflexion: Language Agents With Verbal Reinforcement Learning Noah Shinn et al.
- Enhancing Chat Language Models By Scaling High-quality Instructional Conversations Ning Ding et al.
- Sparse Low-rank Adaptation Of Pre-trained Language Models Ning Ding et al.
- Bridging The Gap: A Survey On Integrating (human) Feedback For Natural Language Generation Patrick Fernandes et al.
- CAT-LM: Training Language Models On Aligned Code And Tests Nikitha Rao, Kush Jain, Uri Alon, Claire Le Goues, Vincent J. Hellendoorn
- Sources Of Hallucination By Large Language Models On Inference Tasks Nick Mckenna et al.
- Jais And Jais-chat: Arabic-centric Foundation And Instruction-tuned Open Generative Large Language Models Neha Sengupta et al.
- Self-regulating Prompts: Foundational Model Adaptation Without Forgetting Muhammad Uzair Khattak et al.
- Scaling Vision Transformers To 22 Billion Parameters Mostafa Dehghani et al.
- Using Large Language Models To Generate Junit Tests: An Empirical Study Mohammed Latif Siddiq et al.
- Do Llms Understand Social Knowledge? Evaluating The Sociability Of Large Language Models With Socket Benchmark Minje Choi, Jiaxin Pei, Sagar Kumar, Chang Shu, David Jurgens
- Api-bank: A Comprehensive Benchmark For Tool-augmented Llms Minghao Li et al.
- A Simple And Effective Pruning Approach For Large Language Models Mingjie Sun, Zhuang Liu, Anna Bair, J. Zico Kolter
- Reward Design With Language Models Minae Kwon, Sang Michael Xie, Kalesha Bullard, Dorsa Sadigh
- Scalable Extraction Of Training Data From (production) Language Models Milad Nasr et al.
- Med-flamingo: A Multimodal Medical Few-shot Learner Michael Moor et al.
- A Large Language Model Approach To Educational Survey Feedback Analysis Michael J. Parker, Caitlin Anderson, Claire Stone, Yearim Oh
- Hyena Hierarchy: Towards Larger Convolutional Language Models Michael Poli et al.
- LAMM: Language-assisted Multi-modal Instruction-tuning Dataset, Framework, And Benchmark Zhenfei Yin et al.
- Toolformer: Language Models Can Teach Themselves To Use Tools Timo Schick et al.
- Qlora: Efficient Finetuning Of Quantized Llms Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, Luke Zettlemoyer
- Spqr: A Sparse-quantized Representation For Near-lossless LLM Weight Compression Tim Dettmers et al.
- Medalpaca -- An Open-source Collection Of Medical Conversational AI Models And Training Data Tianyu Han et al.
- Generalized Planning In PDDL Domains With Pretrained Large Language Models Tom Silver et al.
- Few-shot In-context Learning For Knowledge Base Question Answering Tianle Li et al.
- Having Beer After Prayer? Measuring Cultural Bias In Large Language Models Tarek Naous, Michael J. Ryan, Alan Ritter, Wei Xu
- Multimodal-gpt: A Vision And Language Model For Dialogue With Humans Tao Gong et al.
- Open-ended Medical Visual Question Answering Through Prefix Tuning Of Language Models Tom Van Sonsbeek, Mohammad Mahdi Derakhshani, Ivona Najdenkoska, Cees G. M. Snoek, Marcel Worring
- Large Language Models As General Pattern Machines Suvir Mirchandani et al.
- Orca: Progressive Learning From Complex Explanation Traces Of GPT-4 Subhabrata Mukherjee et al.
- Pretraining Language Models With Human Preferences Tomasz Korbak et al.
- Emergent And Predictable Memorization In Large Language Models Stella Biderman et al.
- Pythia: A Suite For Analyzing Large Language Models Across Training And Scaling Stella Biderman et al.
- Zhongjing: Enhancing The Chinese Medical Capabilities Of Large Language Model Through Expert Feedback And Real-world Multi-turn Dialogue Songhua Yang et al.
- Revisiting Relation Extraction In The Era Of Large Language Models Somin Wadhwa, Silvio Amir, Byron C. Wallace
- Thoughtsource: A Central Hub For Large Language Model Reasoning Data Simon Ott et al.
- Mitigating Object Hallucinations In Large Vision-language Models Through Visual Contrastive Decoding Sicong Leng et al.
- Mariogpt: Open-ended Text2level Generation Through Large Language Models Shyam Sudhakaran et al.
- A Survey On Multimodal Large Language Models Shukang Yin et al.
- Automl-gpt: Automatic Machine Learning With GPT Shujian Zhang, Chengyue Gong, Lemeng Wu, Xingchao Liu, Mingyuan Zhou
- From Words To Watts: Benchmarking The Energy Costs Of Large Language Model Inference Siddharth Samsi et al.
- Do Llms Understand User Preferences? Evaluating Llms On User Rating Prediction Wang-cheng Kang et al.
- Inpars-v2: Large Language Models As Efficient Dataset Generators For Information Retrieval Vitor Jeronymo et al.
- Chatgpt Beyond English: Towards A Comprehensive Evaluation Of Large Language Models In Multilingual Learning Viet Dac Lai et al.
- Scaling Down To Scale Up: A Guide To Parameter-efficient Fine-tuning Vladislav Lialin, Vijeta Deshpande, Xiaowei Yao, Anna Rumshisky
- Evaluating Correctness And Faithfulness Of Instruction-following Models For Question Answering Vaibhav Adlakha, Parishad Behnamghader, Xing Han Lu, Nicholas Meade, Siva Reddy
- Language Model Behavior: A Comprehensive Survey Tyler A. Chang, Benjamin K. Bergen
- Flashattention-2: Faster Attention With Better Parallelism And Work Partitioning Tri Dao
- Nemo Guardrails: A Toolkit For Controllable And Safe LLM Applications With Programmable Rails Traian Rebedea, Razvan Dinu, Makesh Sreedhar, Christopher Parisien, Jonathan Cohen
- Trusting Your Evidence: Hallucinate Less With Context-aware Decoding Weijia Shi et al.
- Promptcblue: A Chinese Prompt Tuning Benchmark For The Medical Domain Wei Zhu, Xiaoling Wang, Huanran Zheng, Mosha Chen, Buzhou Tang
- Alpha-clip: A CLIP Model Focusing On Wherever You Want Zeyi Sun et al.
- A Preliminary Evaluation Of Chatgpt For Zero-shot Dialogue Understanding Wenbo Pan, Qiguang Chen, Xiao Xu, Wanxiang Che, Libo Qin
- A Survey Of Large Language Models Wayne Xin Zhao et al.
- EVA-02: A Visual Representation For Neon Genesis Yuxin Fang et al.
- Is Chatgpt Good At Search? Investigating Large Language Models As Re-ranking Agents Weiwei Sun et al.
- Large Language Models As Zero-shot Conversational Recommenders Zhankui He et al.
- R2gengpt: Radiology Report Generation With Frozen Llms Zhanyu Wang, Lingqiao Liu, Lei Wang, Luping Zhou
- Capabilities Of GPT-4 On Medical Challenge Problems Harsha Nori, Nicholas King, Scott Mayer Mckinney, Dean Carignan, Eric Horvitz
- Can Generalist Foundation Models Outcompete Special-purpose Tuning? Case Study In Medicine Harsha Nori et al.
- Improved Baselines With Visual Instruction Tuning Haotian Liu, Chunyuan Li, Yuheng Li, Yong Jae Lee
- Is Chatgpt The Ultimate Programming Assistant -- How Far Is It? Haoye Tian et al.
- Extractive Summarization Via Chatgpt For Faithful Summary Generation Haopeng Zhang, Xiao Liu, Jiawei Zhang
- Visual-language Prompt Tuning With Knowledge-guided Context Optimization Hantao Yao, Rui Zhang, Changsheng Xu
- Personalisation Within Bounds: A Risk Taxonomy And Policy Framework For The Alignment Of Large Language Models With Personalised Feedback Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale
- Video-llama: An Instruction-tuned Audio-visual Language Model For Video Understanding Hang Zhang, Xin Li, Lidong Bing
- Mplug-2: A Modularized Multi-modal Foundation Model Across Text, Image And Video Haiyang Xu et al.
- Llama Guard: Llm-based Input-output Safeguard For Human-ai Conversations Hakan Inan et al.
- GPT-4 Enhanced Multimodal Grounding For Autonomous Driving: Leveraging Cross-modal Attention With Large Language Models Haicheng Liao et al.
- Gender Bias And Stereotypes In Large Language Models Hadas Kotek, Rikker Dockum, David Q. Sun
- Auggpt: Leveraging Chatgpt For Text Data Augmentation Haixing Dai et al.
- The Refinedweb Dataset For Falcon LLM: Outperforming Curated Corpora With Web Data, And Web Data Only Guilherme Penedo et al.
- Voyager: An Open-ended Embodied Agent With Large Language Models Guanzhi Wang et al.
- Dr Chatgpt, Tell Me What I Want To Hear: How Prompt Knowledge Impacts Health Answer Correctness Guido Zuccon, Bevan Koopman
- Efficient Streaming Language Models With Attention Sinks Guangxuan Xiao, Yuandong Tian, Beidi Chen, Song Han, Mike Lewis
- Level Generation Through Large Language Models Graham Todd, Sam Earle, Muhammad Umair Nasir, Michael Cerny Green, Julian Togelius
- Performance Of The Pre-trained Large Language Model GPT-4 On Automated Short Answer Grading Gerd Kortemeyer
- Personality Traits In Large Language Models Greg Serapio-garcía et al.
- Explainability For Large Language Models: A Survey Haiyan Zhao et al.
- Lawbench: Benchmarking Legal Knowledge Of Large Language Models Zhiwei Fei et al.
- Cheap And Quick: Efficient Vision-language Instruction Tuning For Large Language Models Gen Luo et al.
- Navgpt: Explicit Reasoning In Vision-and-language Navigation With Large Language Models Gengze Zhou, Yicong Hong, Qi Wu
- Synthetic Data Generation With Large Language Models For Text Classification: Potential And Limitations Zhuoyan Li, Hangxiao Zhu, Zhuoran Lu, Ming Yin
- Gemini: A Family Of Highly Capable Multimodal Models Gemini Team et al.
- LLMR: Real-time Prompting Of Interactive Worlds Using Large Language Models Fernanda De La Torre et al.
- Codegen2: Lessons For Training Llms On Programming And Natural Languages Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou
- Do We Still Need Clinical Language Models? Eric Lehman et al.
- Towards Efficient Fine-tuning Of Pre-trained Code Models: An Experimental Study And Beyond Ensheng Shi et al.
- Should Chatgpt Be Biased? Challenges And Risks Of Bias In Large Language Models Emilio Ferrara
- Sparsegpt: Massive Language Models Can Be Accurately Pruned In One-shot Elias Frantar, Dan Alistarh
- Lasuie: Unifying Information Extraction With Latent Adaptive Structure-aware Generative Language Model Hao Fei et al.
- Aligning Large Multimodal Models With Factually Augmented RLHF Zhiqing Sun et al.
- Llm-adapters: An Adapter Family For Parameter-efficient Fine-tuning Of Large Language Models Zhiqiang Hu et al.
- Vipergpt: Visual Inference Via Python Execution For Reasoning Dídac Surís, Sachit Menon, Carl Vondrick
- The Falcon Series Of Open Language Models Ebtesam Almazrouei et al.
- GPT-4 Can Pass The Korean National Licensing Examination For Korean Medicine Doctors Dongyeop Jang, Tae-rim Yun, Choong-yeol Lee, Young-kyu Kwon, Chang-eop Kim
- Speechgpt: Empowering Large Language Models With Intrinsic Cross-modal Conversational Abilities Dong Zhang et al.
- MELTR: Meta Loss Transformer For Learning To Fine-tune Video Foundation Models Dohwan Ko et al.
- The Vector Grounding Problem Dimitri Coelho Mollo, Raphaël Millière
- Promptner: Prompting For Named Entity Recognition Dhananjay Ashok, Zachary C. Lipton
- One Adapter For All Programming Languages? Adapter Tuning For Code Search And Summarization Deze Wang et al.
- Evaluating GPT-3.5 And GPT-4 Models On Brazilian University Admission Exams Desnes Nunes, Ricardo Primi, Ramon Pires, Roberto Lotufo, Rodrigo Nogueira
- Fine-tuning Chatgpt For Automatic Scoring Ehsan Latif, Xiaoming Zhai
- The Capacity For Moral Self-correction In Large Language Models Deep Ganguli et al.
- Text-to-sql Empowered By Large Language Models: A Benchmark Evaluation Dawei Gao et al.
- 3d-vista: Pre-trained Transformer For 3D Vision And Text Alignment Ziyu Zhu et al.
- Palm-e: An Embodied Multimodal Language Model Danny Driess et al.
- SOLAR 10.7B: Scaling Large Language Models With Simple Yet Effective Depth Up-scaling Dahyun Kim et al.
- Improving Accuracy Of GPT-3/4 Results On Biomedical Data Using A Retrieval-augmented Language Model David Soong et al.
- Llava-med: Training A Large Language-and-vision Assistant For Biomedicine In One Day Chunyuan Li et al.
- LIMA: Less Is More For Alignment Chunting Zhou et al.
- Multimodal Foundation Models: From Specialists To General-purpose Assistants Chunyuan Li et al.
- Debiasing Vision-language Models Via Biased Prompts Ching-yao Chuang, Varun Jampani, Yuanzhen Li, Antonio Torralba, Stefanie Jegelka
- A Study On The Implementation Of Generative AI Services Using An Enterprise Data-based LLM Application Architecture Cheonsu Jeong
- Llm-powered Data Augmentation For Enhanced Cross-lingual Performance Chenxi Whitehouse, Monojit Choudhury, Alham Fikri Aji
- Distilling Step-by-step! Outperforming Larger Language Models With Less Training Data And Smaller Model Sizes Cheng-yu Hsieh et al.
- An Iterative Optimizing Framework For Radiology Report Summarization With Chatgpt Chong Ma et al.
- Supporting Qualitative Analysis With Large Language Models: Combining Codebook With GPT-3 For Deductive Coding Ziang Xiao, Xingdi Yuan, Q. Vera Liao, Rania Abdelghani, Pierre-yves Oudeyer
- Generative Speech Recognition Error Correction With Large Language Models And Task-activating Prompting Chao-han Huck Yang et al.
- Blackvip: Black-box Visual Prompting For Robust Transfer Learning Changdae Oh et al.
- K2: A Foundation Language Model For Geoscience Knowledge Understanding And Utilization Cheng Deng et al.
- A Confederacy Of Models: A Comprehensive Evaluation Of Llms On Creative Writing Carlos Gómez-rodríguez, Paul Williams
- Wizardlm: Empowering Large Language Models To Follow Complex Instructions Can Xu et al.
- Compositional Chain-of-thought Prompting For Large Multimodal Models Chancharik Mitra, Brandon Huang, Trevor Darrell, Roei Herzig
- Pmc-llama: Towards Building Open-source Language Models For Medicine Chaoyi Wu et al.
- Reinforced Self-training (rest) For Language Modeling Caglar Gulcehre et al.
- Adapting Large Language Models By Integrating Collaborative Semantics For Recommendation Bowen Zheng et al.
- Prompting Or Fine-tuning? A Comparative Study Of Large Language Models For Taxonomy Construction Boqi Chen, Fandi Yi, Dániel Varró
- RWKV: Reinventing Rnns For The Transformer Era Bo Peng et al.
- Vtimellm: Empower LLM To Grasp Video Moments Bin Huang, Xin Wang, Hong Chen, Zihan Song, Wenwu Zhu
- Prompting Large Language Model For Machine Translation: A Case Study Biao Zhang, Barry Haddow, Alexandra Birch
- Motiongpt: Human Motion As A Foreign Language Biao Jiang et al.
- Can Large Language Models Transform Computational Social Science? Caleb Ziems et al.
- Instruction Tuning With GPT-4 Baolin Peng, Chunyuan Li, Pengcheng He, Michel Galley, Jianfeng Gao
- Coupling Large Language Models With Logic Programming For Robust And General Reasoning From Text Zhun Yang, Adam Ishay, Joohyung Lee
- Clinical Camel: An Open Expert-level Medical Language Model With Dialogue-based Knowledge Encoding Augustin Toma et al.
- Orca 2: Teaching Small Language Models How To Reason Arindam Mitra et al.
- RT-2: Vision-language-action Models Transfer Web Knowledge To Robotic Control Anthony Brohan et al.
- Expel: LLM Agents Are Experiential Learners Andrew Zhao et al.
- Fundamentals Of Generative Large Language Models And Perspectives In Cyber-defense Andrei Kucharavy et al.
- Openassistant Conversations -- Democratizing Large Language Model Alignment Andreas Köpf et al.
- Openflamingo: An Open-source Framework For Training Large Autoregressive Vision-language Models Anas Awadalla et al.
- Opening Up Chatgpt: Tracking Openness, Transparency, And Accountability In Instruction-tuned Text Generators Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse
- The Impact Of Positional Encoding On Length Generalization In Transformers Amirhossein Kazemnejad, Inkit Padhi, Karthikeyan Natesan Ramamurthy, Payel Das, Siva Reddy
- Self-refine: Iterative Refinement With Self-feedback Aman Madaan et al.
- Lamp: When Large Language Models Meet Personalization Alireza Salemi, Sheshera Mysore, Michael Bendersky, Hamed Zamani
- Model Tuning Or Prompt Tuning? A Study Of Large Language Models For Clinical Concept And Relation Extraction Cheng Peng et al.
- Jailbroken: How Does LLM Safety Training Fail? Alexander Wei, Nika Haghtalab, Jacob Steinhardt
- Can Chatgpt Forecast Stock Price Movements? Return Predictability And Large Language Models Alejandro Lopez-lira, Yuehua Tang
- Clipsyntel: CLIP And LLM Synergy For Multimodal Question Summarization In Healthcare Akash Ghosh et al.
- Mamba: Linear-time Sequence Modeling With Selective State Spaces Albert Gu, Tri Dao
- Baichuan 2: Open Large-scale Language Models Aiyuan Yang et al.
- Toolllm: Facilitating Large Language Models To Master 16000+ Real-world Apis Yujia Qin et al.
- Adaptive Machine Translation With Large Language Models Yasmin Moslem, Rejwanul Haque, John D. Kelleher, Andy Way
- Prompting Large Language Models With Speech Recognition Abilities Yassir Fathullah et al.
- Embodiedgpt: Vision-language Pre-training Via Embodied Chain Of Thought Yao Mu et al.
- On Learning To Summarize With Large Language Models As References Yixin Liu et al.
- Recommender Systems In The Era Of Large Language Models (llms) Zihuai Zhao et al.
- Efficient And Effective Text Encoding For Chinese Llama And Alpaca Yiming Cui, Ziqing Yang, Xin Yao
- Graph Neural Prompting With Large Language Models Yijun Tian et al.
- Summary Of Chatgpt-related Research And Perspective Towards The Future Of Large Language Models Yiheng Liu et al.
- Making Large Language Models Perform Better In Knowledge Graph Completion Yichi Zhang et al.
- INSTRUCTEVAL: Towards Holistic Evaluation Of Instruction-tuned Large Language Models Yew Ken Chia, Pengfei Hong, Lidong Bing, Soujanya Poria
- NL2TL: Transforming Natural Languages To Temporal Logics Using Large Language Models Yongchao Chen, Rujul Gandhi, Yang Zhang, Chuchu Fan
- Assessing Cross-cultural Alignment Between Chatgpt And Human Societies: An Empirical Study Yong Cao et al.
- Biomedgpt: Open Multimodal Generative Pre-trained Transformer For Biomedicine Yizhen Luo et al.
- Pandagpt: One Model To Instruction-follow Them All Yixuan Su et al.
- Key-locked Rank One Editing For Text-to-image Personalization Yoad Tewel, Rinon Gal, Gal Chechik, Yuval Atzmon
- When Prompt-based Incremental Learning Does Not Meet Strong Pretraining Yu-ming Tang, Yi-xing Peng, Wei-shi Zheng
- Low-rank Adaptation Of Large Language Model Rescoring For Parameter-efficient Speech Recognition Yu Yu et al.
- Alpacafarm: A Simulation Framework For Methods That Learn From Human Feedback Yann Dubois et al.
- Bubogpt: Enabling Visual Grounding In Multi-modal Llms Yang Zhao et al.
- Improving Large Language Models For Clinical Named Entity Recognition Via Prompt Engineering Yan Hu et al.
- Mental-llm: Leveraging Large Language Models For Mental Health Prediction Via Online Text Data Xuhai Xu et al.
- Emotional Intelligence Of Large Language Models Xuena Wang, Xueting Li, Zi Yin, Yue Wu, Liu Jia
- Fine-tuning Llama For Multi-stage Text Retrieval Xueguang Ma, Liang Wang, Nan Yang, Furu Wei, Jimmy Lin
- Ghost In The Minecraft: Generally Capable Agents For Open-world Environments Via Large Language Models With Text-based Knowledge And Memory Xizhou Zhu et al.
- Llm-pruner: On The Structural Pruning Of Large Language Models Xinyin Ma, Gongfan Fang, Xinchao Wang
- LISA: Reasoning Segmentation Via Large Language Model Xin Lai et al.
- Xuanyuan 2.0: A Large Chinese Financial Chat Model With Hundreds Of Billions Parameters Xuanyu Zhang, Qing Yang, Dongliang Xu
- HPC-GPT: Integrating Large Language Model For High-performance Computing Xianzhong Ding et al.
- Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! Xiangyu Qi et al.
- Don't Trust Chatgpt When Your Question Is Not In English: A Study Of Multilingual Abilities And Types Of Llms Xiang Zhang, Senyu Li, Bradley Hauer, Ning Shi, Grzegorz Kondrak
- Medagents: Large Language Models As Collaborators For Zero-shot Medical Reasoning Xiangru Tang et al.
- The Unreasonable Effectiveness Of Few-shot Learning For Machine Translation Xavier Garcia et al.
- Instructblip: Towards General-purpose Vision-language Models With Instruction Tuning Wenliang Dai et al.
- SPHINX: The Joint Mixing Of Weights, Tasks, And Visual Embeddings For Multi-modal Large Language Models Ziyi Lin et al.
- Guiding Pretraining In Reinforcement Learning With Large Language Models Yuqing Du et al.
- Retentive Network: A Successor To Transformer For Large Language Models Yutao Sun et al.
- Longbench: A Bilingual, Multitask Benchmark For Long Context Understanding Yushi Bai et al.
- Minillm: Knowledge Distillation Of Large Language Models Yuxian Gu, Li Dong, Furu Wei, Minlie Huang
- Chatdoctor: A Medical Chat Model Fine-tuned On A Large Language Model Meta-ai (llama) Using Medical Domain Knowledge Yunxiang Li et al.
- Towards Open-world Recommendation With Knowledge Augmentation From Large Language Models Yunjia Xi et al.
- An Empirical Study Of Catastrophic Forgetting In Large Language Models During Continual Fine-tuning Yun Luo et al.
- Character-llm: A Trainable Agent For Role-playing Yunfan Shao, Linyang Li, Junqi Dai, Xipeng Qiu
- Exploring The Impact Of Instruction Data Scaling On Large Language Models: An Empirical Study On Real-world Use Cases Yunjie Ji et al.
- Educhat: A Large-scale Language Model-based Chatbot System For Intelligent Education Yuhao Dan et al.
- Large Language Model As Attributed Training Data Generator: A Tale Of Diversity And Bias Yue Yu et al.
- Aligning Large Language Models With Human: A Survey Yufei Wang et al.
- Toolqa: A Dataset For LLM Question Answering With External Tools Yuchen Zhuang, Yue Yu, Kuan Wang, Haotian Sun, Chao Zhang
- Preventing Zero-shot Transfer Degradation In Continual Learning Of Vision-language Models Zangwei Zheng et al.
- Guiding Large Language Models Via Directional Stimulus Prompting Zekun Li et al.
- MEDITRON-70B: Scaling Medical Pretraining For Large Language Models Zeming Chen et al.
- Fine-grained Human Feedback Gives Better Rewards For Language Model Training Zeqiu Wu et al.
- Billm: Pushing The Limit Of Post-training Quantization For Llms Wei Huang et al.
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- Continual Learning For Large Language Models: A Survey Tongtong Wu et al.
- Chatglm: A Family Of Large Language Models From GLM-130B To GLM-4 All Tools Team Glm et al.
- Adaptmllm: Fine-tuning Multilingual Language Models On Low-resource Languages With Integrated LLM Playgrounds Séamus Lankford, Haithem Afli, Andy Way
- Chatgpt As Research Scientist: Probing Gpt's Capabilities As A Research Librarian, Research Ethicist, Data Generator And Data Predictor Steven A. Lehr, Aylin Caliskan, Suneragiri Liyanage, Mahzarin R. Banaji
- The Era Of 1-bit Llms: All Large Language Models Are In 1.58 Bits Shuming Ma et al.
- Eyes Wide Shut? Exploring The Visual Shortcomings Of Multimodal Llms Shengbang Tong et al.
- Large Language Models Meet Collaborative Filtering: An Efficient All-round Llm-based Recommender System Sein Kim et al.
- A Comprehensive Survey Of Hallucination Mitigation Techniques In Large Language Models S. M Towhidul Islam Tonmoy et al.
- Me Llama: Foundation Large Language Models For Medical Applications Qianqian Xie et al.
- Jamba: A Hybrid Transformer-mamba Language Model Opher Lieber et al.
- Fine-tuned Language Models Generate Stable Inorganic Materials As Text Nate Gruver et al.
- Findings Of The Second Babylm Challenge: Sample-efficient Pretraining On Developmentally Plausible Corpora Michael Y. Hu et al.
- A Survey Of Resource-efficient LLM And Multimodal Foundation Models Mengwei Xu et al.
- Exploring Chatgpt And Its Impact On Society Md. Asraful Haque, Shuai Li
- Language Models For Code Completion: A Practical Evaluation Maliheh Izadi et al.
- Data Is All You Need: Finetuning Llms For Chip Design Via An Automated Design-data Augmentation Framework Kaiyan Chang et al.
- The Dawn After The Dark: An Empirical Study On Factuality Hallucination In Large Language Models Junyi Li et al.
- Pixart-σ: Weak-to-strong Training Of Diffusion Transformer For 4K Text-to-image Generation Junsong Chen et al.
- ORPO: Monolithic Preference Optimization Without Reference Model Jiwoo Hong, Noah Lee, James Thorne
- Openmedlm: Prompt Engineering Can Out-perform Fine-tuning In Medical Question-answering With Open-source Large Language Models Jenish Maharjan et al.
- Fine Tuning Vs. Retrieval Augmented Generation For Less Popular Knowledge Heydar Soudani, Evangelos Kanoulas, Faegheh Hasibi
- Closing The Gap Between Open-source And Commercial Large Language Models For Medical Evidence Summarization Gongbo Zhang et al.
- Code-aware Prompting: A Study Of Coverage Guided Test Generation In Regression Setting Using LLM Gabriel Ryan et al.
- Embedding Large Language Models Into Extended Reality: Opportunities And Challenges For Inclusion, Engagement, And Privacy Efe Bozkir et al.
- Olmo: Accelerating The Science Of Language Models Dirk Groeneveld et al.
- Deepseek-v2: A Strong, Economical, And Efficient Mixture-of-experts Language Model Deepseek-ai et al.
- The Revolution Of Multimodal Large Language Models: A Survey Davide Caffagni et al.
- Understanding Large-language Model (llm)-powered Human-robot Interaction Callie Y. Kim, Christine P. Lee, Bilge Mutlu
- MM1: Methods, Analysis & Insights From Multimodal LLM Pre-training Brandon Mckinzie et al.
- Moe-llava: Mixture Of Experts For Large Vision-language Models Bin Lin et al.
- RAG Vs Fine-tuning: Pipelines, Tradeoffs, And A Case Study On Agriculture Angels Balaguer et al.
- Why And When Llm-based Assistants Can Go Wrong: Investigating The Effectiveness Of Prompt-based Interactions For Software Help-seeking Anjali Khurana, Hari Subramonyam, Parmit K Chilana
- AI And Memory Wall Amir Gholami et al.
- Financial Statement Analysis With Large Language Models Alex Kim, Maximilian Muhn, Valeri Nikolaev
- Yi: Open Foundation Models By 01.AI 01. Ai et al.
- Does Fine-tuning Llms On New Knowledge Encourage Hallucinations? Zorik Gekhman et al.
- Harnessing Large Language Models For Text-rich Sequential Recommendation Zhi Zheng, Wenshuo Chao, Zhaopeng Qiu, Hengshu Zhu, Hui Xiong
- Large Language Models For Data Annotation And Synthesis: A Survey Zhen Tan et al.
- Llmparser: An Exploratory Study On Using Large Language Models For Log Parsing Zeyang Ma, An Ran Chen, Dong Jae Kim, Tse-hsun Chen, Shaowei Wang
- A Survey On Lora Of Large Language Models Yuren Mao et al.
- Large Language Model (LLM) AI Text Generation Detection Based On Transformer Deep Learning Algorithm Yuhong Mo, Hao Qin, Yushan Dong, Ziyi Zhu, Zhenglin Li
- Understanding Llms: A Comprehensive Overview From Training To Inference Yiheng Liu et al.
- Unist: A Prompt-empowered Universal Model For Urban Spatio-temporal Prediction Yuan Yuan, Jingtao Ding, Jie Feng, Depeng Jin, Yong Li
- Llamafactory: Unified Efficient Fine-tuning Of 100+ Language Models Yaowei Zheng et al.
- Datasets For Large Language Models: A Comprehensive Survey Yang Liu, Jiahuan Cao, Chongyu Liu, Kai Ding, Lianwen Jin
- Data-efficient Fine-tuning For Llm-based Recommendation Xinyu Lin et al.
- Mgte: Generalized Long-context Text Representation And Reranking Models For Multilingual Text Retrieval Xin Zhang et al.
- A Survey On RAG Meeting Llms: Towards Retrieval-augmented Large Language Models Wenqi Fan et al.
- Deepseek-r1: Incentivizing Reasoning Capability In Llms Via Reinforcement Learning Deepseek-ai et al.
- Findings Of The Babylm Challenge: Sample-efficient Pretraining On Developmentally Plausible Corpora Alex Warstadt et al.
🏷 Transformer
- Topic Aware Neural Response Generation Chen Xing et al.
- Attention Strategies For Multi-source Sequence-to-sequence Learning Jindřich Libovický, Jindřich Helcl
- Attention Is All You Need Ashish Vaswani et al.
- Weighted Transformer Network For Machine Translation Karim Ahmed, Nitish Shirish Keskar, Richard Socher
- Gated-attention Architectures For Task-oriented Language Grounding Devendra Singh Chaplot, Kanthashree Mysore Sathyendra, Rama Kumar Pasumarthi, Dheeraj Rajagopal, Ruslan Salakhutdinov
- Frustratingly Short Attention Spans In Neural Language Modeling Michał Daniluk, Tim Rocktäschel, Johannes Welbl, Sebastian Riedel
- The Memad Submission To The WMT18 Multimodal Translation Task Stig-arne Grönroos et al.
- Multilingual Constituency Parsing With Self-attention And Pre-training Nikita Kitaev, Steven Cao, Dan Klein
- Character-level Language Modeling With Deeper Self-attention Rami Al-rfou, Dokook Choe, Noah Constant, Mandy Guo, Llion Jones
- Commonsense For Generative Multi-hop Question Answering Tasks Lisa Bauer, Yicheng Wang, Mohit Bansal
- Sdnet: Contextualized Attention-based Deep Network For Conversational Question Answering Chenguang Zhu, Michael Zeng, Xuedong Huang
- An Affect-rich Neural Conversational Model With Biased Attention And Weighted Cross-entropy Loss Peixiang Zhong, Di Wang, Chunyan Miao
- Improving The Transformer Translation Model With Document-level Context Jiacheng Zhang et al.
- Training Tips For The Transformer Model Martin Popel, Ondřej Bojar
- Tensor2tensor For Neural Machine Translation Ashish Vaswani et al.
- "bilingual Expert" Can Find Translation Errors Kai Fan et al.
- Multi-cast Attention Networks For Retrieval-based Question Answering And Response Prediction Yi Tay, Luu Anh Tuan, Siu Cheung Hui
- BERT: Pre-training Of Deep Bidirectional Transformers For Language Understanding Jacob Devlin, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Unified Vision-language Pre-training For Image Captioning And VQA Luowei Zhou et al.
- Sample Efficient Text Summarization Using A Single Pre-trained Transformer Urvashi Khandelwal, Kevin Clark, Dan Jurafsky, Lukasz Kaiser
- Recosa: Detecting The Relevant Contexts With Self-attention For Multi-turn Dialogue Generation Hainan Zhang, Yanyan Lan, Liang Pang, Jiafeng Guo, Xueqi Cheng
- Efficient Adaptation Of Pretrained Transformers For Abstractive Summarization Andrew Hoang, Antoine Bosselut, Asli Celikyilmaz, Yejin Choi
- MKD: A Multi-task Knowledge Distillation Approach For Pretrained Language Models Linqing Liu, Huan Wang, Jimmy Lin, Richard Socher, Caiming Xiong
- Multimodal Attention Networks For Low-level Vision-and-language Navigation Federico Landi, Lorenzo Baraldi, Marcella Cornia, Massimiliano Corsini, Rita Cucchiara
- Bert4rec: Sequential Recommendation With Bidirectional Encoder Representations From Transformer Fei Sun et al.
- Entity-consistent End-to-end Task-oriented Dialogue System With KB Retriever Libo Qin et al.
- Attention Is Not Explanation Sarthak Jain, Byron C. Wallace
- Transformer-xl: Attentive Language Models Beyond A Fixed-length Context Zihang Dai et al.
- Revealing The Dark Secrets Of BERT Olga Kovaleva, Alexey Romanov, Anna Rogers, Anna Rumshisky
- Align, Mask And Select: A Simple Method For Incorporating Commonsense Knowledge Into Language Representation Models Zhi-xiu Ye, Qian Chen, Wen Wang, Zhen-hua Ling
- Fully Quantized Transformer For Machine Translation Gabriele Prato, Ella Charlaix, Mehdi Rezagholizadeh
- Understanding The Behaviors Of BERT In Ranking Yifan Qiao, Chenyan Xiong, Zhenghao Liu, Zhiyuan Liu
- Convert: Efficient And Accurate Conversational Representations From Transformers Matthew Henderson et al.
- Reducing Transformer Depth On Demand With Structured Dropout Angela Fan, Edouard Grave, Armand Joulin
- Contextualized Sparse Representations For Real-time Open-domain Question Answering Jinhyuk Lee, Minjoon Seo, Hannaneh Hajishirzi, Jaewoo Kang
- Q8BERT: Quantized 8bit BERT Ofir Zafrir, Guy Boudoukh, Peter Izsak, Moshe Wasserblat
- Improving Transformer Models By Reordering Their Sublayers Ofir Press, Noah A. Smith, Omer Levy
- Pretrained Language Models For Sequential Sentence Classification Arman Cohan, Iz Beltagy, Daniel King, Bhavana Dalvi, Daniel S. Weld
- PEGASUS: Pre-training With Extracted Gap-sentences For Abstractive Summarization Jingqing Zhang, Yao Zhao, Mohammad Saleh, Peter J. Liu
- BERT For Joint Intent Classification And Slot Filling Qian Chen, Zhu Zhuo, Wen Wang
- Neural Assistant: Joint Action Prediction, Response Generation, And Latent Knowledge Reasoning Arvind Neelakantan et al.
- PLATO: Pre-trained Dialogue Generation Model With Discrete Latent Variable Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang
- Transformers Without Tears: Improving The Normalization Of Self-attention Toan Q. Nguyen, Julian Salazar
- Camembert: A Tasty French Language Model Louis Martin et al.
- Non-autoregressive Transformer By Position Learning Yu Bao et al.
- Dialogue Transformers Vladimir Vlasov, Johannes E. M. Mosig, Alan Nichol
- Encode, Tag, Realize: High-precision Text Editing Eric Malmi, Sebastian Krause, Sascha Rothe, Daniil Mirylenka, Aliaksei Severyn
- Compressive Transformers For Long-range Sequence Modelling Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, Timothy P. Lillicrap
- LXMERT: Learning Cross-modality Encoder Representations From Transformers Hao Tan, Mohit Bansal
- Unified Language Model Pre-training For Natural Language Understanding And Generation Li Dong et al.
- A Tensorized Transformer For Language Modeling Xindian Ma et al.
- Unsupervised Cross-lingual Representation Learning At Scale Alexis Conneau et al.
- Pay Less Attention With Lightweight And Dynamic Convolutions Felix Wu, Angela Fan, Alexei Baevski, Yann N. Dauphin, Michael Auli
- Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection Guangxiang Zhao et al.
- Cloze-driven Pretraining Of Self-attention Networks Alexei Baevski, Sergey Edunov, Yinhan Liu, Luke Zettlemoyer, Michael Auli
- Are Sixteen Heads Really Better Than One? Paul Michel, Omer Levy, Graham Neubig
- MUSE: Parallel Multi-scale Attention For Sequence To Sequence Learning Guangxiang Zhao, Xu Sun, Jingjing Xu, Zhiyuan Zhang, Liangchen Luo
- Structured Pruning Of A Bert-based Question Answering Model J. S. Mccarley, Rishav Chakravarti, Avirup Sil
- Tinybert: Distilling BERT For Natural Language Understanding Xiaoqi Jiao et al.
- Single Headed Attention RNN: Stop Thinking With Your Head Stephen Merity
- Repurposing Entailment For Multi-hop Question Answering Tasks Harsh Trivedi, Heeyoung Kwon, Tushar Khot, Ashish Sabharwal, Niranjan Balasubramanian
- Language Modeling With Deep Transformers Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney
- Modeling Graph Structure In Transformer For Better Amr-to-text Generation Jie Zhu et al.
- Modeling Recurrence For Transformer Jie Hao et al.
- Transfertransfo: A Transfer Learning Approach For Neural Network Based Conversational Agents Thomas Wolf, Victor Sanh, Julien Chaumond, Clement Delangue
- The Second Conversational Intelligence Challenge (convai2) Emily Dinan et al.
- An Effective Domain Adaptive Post-training Method For BERT In Response Selection Taesun Whang et al.
- The Bottom-up Evolution Of Representations In The Transformer: A Study With Machine Translation And Language Modeling Objectives Elena Voita, Rico Sennrich, Ivan Titov
- How Does BERT Answer Questions? A Layer-wise Analysis Of Transformer Representations Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- What Would Elsa Do? Freezing Layers During Transformer Fine-tuning Jaejun Lee, Raphael Tang, Jimmy Lin
- Exbert: A Visual Analysis Tool To Explore Learned Representations In Transformers Models Benjamin Hoover, Hendrik Strobelt, Sebastian Gehrmann
- Adding Interpretable Attention To Neural Translation Models Improves Word Alignment Thomas Zenkel, Joern Wuebker, John Denero
- VL-BERT: Pre-training Of Generic Visual-linguistic Representations Weijie Su et al.
- Story Ending Prediction By Transferable BERT Zhongyang Li, Xiao Ding, Ting Liu
- Analyzing Multi-head Self-attention: Specialized Heads Do The Heavy Lifting, The Rest Can Be Pruned Elena Voita, David Talbot, Fedor Moiseev, Rico Sennrich, Ivan Titov
- Learning To Deceive With Attention-based Explanations Danish Pruthi, Mansi Gupta, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton
- The Evolved Transformer David R. So, Chen Liang, Quoc V. Le
- Insertion-based Decoding With Automatically Inferred Generation Order Jiatao Gu, Qi Liu, Kyunghyun Cho
- Scheduled Sampling For Transformers Tsvetomila Mihaylova, André F. T. Martins
- Levenshtein Transformer Jiatao Gu, Changhan Wang, Jake Zhao
- Interpreting And Improving Natural-language Processing (in Machines) With Natural Language-processing (in The Brain) Mariya Toneva, Leila Wehbe
- Leveraging Pre-trained Checkpoints For Sequence Generation Tasks Sascha Rothe, Shashi Narayan, Aliaksei Severyn
- CTRL: A Conditional Transformer Language Model For Controllable Generation Nitish Shirish Keskar, Bryan Mccann, Lav R. Varshney, Caiming Xiong, Richard Socher
- Distilling Knowledge Learned In BERT For Text Generation Yen-chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu
- Stabilizing Transformers For Reinforcement Learning Emilio Parisotto et al.
- Dialogpt: Large-scale Generative Pre-training For Conversational Response Generation Yizhe Zhang et al.
- Deep Learning Based Chatbot Models Richard Csaky
- A Multiscale Visualization Of Attention In The Transformer Model Jesse Vig
- Parameter-efficient Transfer Learning For NLP Neil Houlsby et al.
- Controlling The Output Length Of Neural Machine Translation Surafel Melaku Lakew, Mattia Di Gangi, Marcello Federico
- Bp-transformer: Modelling Long-range Context Via Binary Partitioning Zihao Ye, Qipeng Guo, Quan Gan, Xipeng Qiu, Zheng Zhang
- Do Attention Heads In BERT Track Syntactic Dependencies? Phu Mon Htut, Jason Phang, Shikha Bordia, Samuel R. Bowman
- Tree Transformer: Integrating Tree Structures Into Self-attention Yau-shian Wang, Hung-yi Lee, Yun-nung Chen
- Attentive History Selection For Conversational Question Answering Chen Qu et al.
- Freelb: Enhanced Adversarial Training For Natural Language Understanding Chen Zhu et al.
- Augmenting Self-attention With Persistent Memory Sainbayar Sukhbaatar, Edouard Grave, Guillaume Lample, Herve Jegou, Armand Joulin
- Adaptive Attention Span In Transformers Sainbayar Sukhbaatar, Edouard Grave, Piotr Bojanowski, Armand Joulin
- Learning And Evaluating Contextual Embedding Of Source Code Aditya Kanade, Petros Maniatis, Gogul Balakrishnan, Kensen Shi
- Linguistic Knowledge And Transferability Of Contextual Representations Nelson F. Liu, Matt Gardner, Yonatan Belinkov, Matthew E. Peters, Noah A. Smith
- Unicoder-vl: A Universal Encoder For Vision And Language By Cross-modal Pre-training Gen Li et al.
- Plug And Play Language Models: A Simple Approach To Controlled Text Generation Sumanth Dathathri et al.
- Blockwise Self-attention For Long Document Understanding Jiezhong Qiu et al.
- Span Selection Pre-training For Question Answering Michael Glass et al.
- Adapting And Evaluating A Deep Learning Language Model For Clinical Why-question Answering Andrew Wen, Mohamed Y. Elwazir, Sungrim Moon, Jungwei Fan
- Fast Transformer Decoding: One Write-head Is All You Need Noam Shazeer
- Semantically Conditioned Dialog Response Generation Via Hierarchical Disentangled Self-attention Wenhu Chen, Jianshu Chen, Pengda Qin, Xifeng Yan, William Yang Wang
- Bridging The Gap For Tokenizer-free Language Models Dokook Choe, Rami Al-rfou, Mandy Guo, Heeyoung Lee, Noah Constant
- Do Neural Dialog Systems Use The Conversation History Effectively? An Empirical Study Chinnadhurai Sankar, Sandeep Subramanian, Christopher Pal, Sarath Chandar, Yoshua Bengio
- Insertion Transformer: Flexible Sequence Generation Via Insertion Operations Mitchell Stern, William Chan, Jamie Kiros, Jakob Uszkoreit
- Lakhnes: Improving Multi-instrumental Music Generation With Cross-domain Pre-training Chris Donahue, Huanru Henry Mao, Yiting Ethan Li, Garrison W. Cottrell, Julian Mcauley
- Context-aware Learning For Neural Machine Translation Sébastien Jean, Kyunghyun Cho
- Fusion Of Detected Objects In Text For Visual Question Answering Chris Alberti, Jeffrey Ling, Michael Collins, David Reitter
- Microsoft Translator At WMT 2019: Towards Large-scale Document-level Neural Machine Translation Marcin Junczys-dowmunt
- Synchronous Bidirectional Inference For Neural Sequence Generation Jiajun Zhang, Long Zhou, Yang Zhao, Chengqing Zong
- Megatron-lm: Training Multi-billion Parameter Language Models Using Model Parallelism Mohammad Shoeybi et al.
- Learning To Few-shot Learn Across Diverse Natural Language Classification Tasks Trapit Bansal, Rishikesh Jha, Andrew Mccallum
- Text Infilling Wanrong Zhu, Zhiting Hu, Eric Xing
- Syntax-infused Transformer And BERT Models For Machine Translation And Natural Language Understanding Dhanasekar Sundararaman et al.
- Text Summarization With Pretrained Encoders Yang Liu, Mirella Lapata
- Towards Making The Most Of BERT In Neural Machine Translation Jiacheng Yang et al.
- TANDA: Transfer And Adapt Pre-trained Transformer Models For Answer Sentence Selection Siddhant Garg, Thuy Vu, Alessandro Moschitti
- Incremental Transformer With Deliberation Decoder For Document Grounded Conversations Zekang Li et al.
- Exploring The Limits Of Transfer Learning With A Unified Text-to-text Transformer Colin Raffel et al.
- Learning To Answer By Learning To Ask: Getting The Best Of GPT-2 And BERT Worlds Tassilo Klein, Moin Nabi
- Encoder-agnostic Adaptation For Conditional Language Generation Zachary M. Ziegler, Luke Melas-kyriazi, Sebastian Gehrmann, Alexander M. Rush
- Visualizing Attention In Transformer-based Language Representation Models Jesse Vig
- Sg-net: Syntax-guided Machine Reading Comprehension Zhuosheng Zhang et al.
- Modifying Memories In Transformer Models Chen Zhu et al.
- Colake: Contextualized Language And Knowledge Embedding Tianxiang Sun et al.
- Low-rank Bottleneck In Multi-head Attention Models Srinadh Bhojanapalli, Chulhee Yun, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar
- Open-retrieval Conversational Question Answering Chen Qu et al.
- Pre-training Text-to-text Transformers For Concept-centric Common Sense Wangchunshu Zhou et al.
- SEAL: Segment-wise Extractive-abstractive Long-form Text Summarization Yao Zhao, Mohammad Saleh, Peter J. Liu
- Linformer: Self-attention With Linear Complexity Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, Hao Ma
- Measuring Systematic Generalization In Neural Proof Generation With Transformers Nicolas Gontier, Koustuv Sinha, Siva Reddy, Christopher Pal
- SPARTA: Efficient Open-domain Question Answering Via Sparse Transformer Matching Retrieval Tiancheng Zhao, Xiaopeng Lu, Kyusong Lee
- Transformers As Soft Reasoners Over Language Peter Clark, Oyvind Tafjord, Kyle Richardson
- Train Large, Then Compress: Rethinking Model Size For Efficient Training And Inference Of Transformers Zhuohan Li et al.
- Pre-trained Summarization Distillation Sam Shleifer, Alexander M. Rush
- Sequential Latent Knowledge Selection For Knowledge-grounded Dialogue Byeongchang Kim, Jaewoo Ahn, Gunhee Kim
- Russiansuperglue: A Russian Language Understanding Evaluation Benchmark Tatiana Shavrina et al.
- KRISP: Integrating Implicit And Symbolic Knowledge For Open-domain Knowledge-based VQA Kenneth Marino, Xinlei Chen, Devi Parikh, Abhinav Gupta, Marcus Rohrbach
- When BERT Plays The Lottery, All Tickets Are Winning Sai Prasanna, Anna Rogers, Anna Rumshisky
- Pretrained Transformers Improve Out-of-distribution Robustness Dan Hendrycks et al.
- The Chess Transformer: Mastering Play Using Generative Language Models David Noever, Matt Ciolino, Josh Kalin
- Conversational Question Reformulation Via Sequence-to-sequence Architectures And Pretrained Language Models Sheng-chieh Lin et al.
- Pretrained Transformers For Simple Question Answering Over Knowledge Graphs D. Lukovnikov, A. Fischer, J. Lehmann
- KVL-BERT: Knowledge Enhanced Visual-and-linguistic BERT For Visual Commonsense Reasoning Dandan Song, Siyi Ma, Zhanchen Sun, Sicheng Yang, Lejian Liao
- Layoutlmv2: Multi-modal Pre-training For Visually-rich Document Understanding Yang Xu et al.
- How Effective Is Task-agnostic Data Augmentation For Pretrained Transformers? Shayne Longpre, Yu Wang, Christopher Dubois
- Pymt5: Multi-mode Translation Of Natural Language And Python Code With Transformers Colin B. Clement, Dawn Drain, Jonathan Timcheck, Alexey Svyatkovskiy, Neel Sundaresan
- Lite Transformer With Long-short Range Attention Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han
- Deebert: Dynamic Early Exiting For Accelerating BERT Inference Ji Xin, Raphael Tang, Jaejun Lee, Yaoliang Yu, Jimmy Lin
- Unnatural Language Inference Koustuv Sinha, Prasanna Parthasarathi, Joelle Pineau, Adina Williams
- Knowledge-aware Language Model Pretraining Corby Rosset et al.
- Mapping Natural Language Instructions To Mobile UI Action Sequences Yang Li, Jiacong He, Xin Zhou, Yuan Zhang, Jason Baldridge
- Code Prediction By Feeding Trees To Transformers Seohyun Kim, Jinman Zhao, Yuchi Tian, Satish Chandra
- Ternarybert: Distillation-aware Ultra-low Bit BERT Wei Zhang et al.
- Coda: Contrast-enhanced And Diversity-promoting Data Augmentation For Natural Language Understanding Yanru Qu et al.
- Gpt-too: A Language-model-first Approach For Amr-to-text Generation Manuel Mager et al.
- VD-BERT: A Unified Vision And Dialog Transformer With BERT Yue Wang et al.
- Compressing Large-scale Transformer-based Models: A Case Study On BERT Prakhar Ganesh et al.
- IART: Intent-aware Response Ranking With Transformers In Information-seeking Conversation Systems Liu Yang et al.
- EDITOR: An Edit-based Transformer With Repositioning For Neural Machine Translation With Soft Lexical Constraints Weijia Xu, Marine Carpuat
- Byte Pair Encoding Is Suboptimal For Language Model Pretraining Kaj Bostrom, Greg Durrett
- Sequence-level Mixed Sample Data Augmentation Demi Guo, Yoon Kim, Alexander M. Rush
- A Simple But Tough-to-beat Data Augmentation Approach For Natural Language Understanding And Generation Dinghan Shen, Mingzhi Zheng, Yelong Shen, Yanru Qu, Weizhu Chen
- On The Stability Of Fine-tuning BERT: Misconceptions, Explanations, And Strong Baselines Marius Mosbach, Maksym Andriushchenko, Dietrich Klakow
- Gshard: Scaling Giant Models With Conditional Computation And Automatic Sharding Dmitry Lepikhin et al.
- Codebert: A Pre-trained Model For Programming And Natural Languages Zhangyin Feng et al.
- Rapidly Bootstrapping A Question Answering Dataset For COVID-19 Raphael Tang et al.
- Deformer: Decomposing Pre-trained Transformers For Faster Question Answering Qingqing Cao, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian
- Efficient Transformer-based Large Scale Language Representations Using Hardware-friendly Block Structured Pruning Bingbing Li et al.
- TIME: Text And Image Mutual-translation Adversarial Networks Bingchen Liu, Kunpeng Song, Yizhe Zhu, Gerard De Melo, Ahmed Elgammal
- PALM: Pre-training An Autoencoding & Autoregressive Language Model For Context-conditioned Generation Bin Bi et al.
- Query Resolution For Conversational Search With Limited Supervision Nikos Voskarides, Dan Li, Pengjie Ren, Evangelos Kanoulas, Maarten De Rijke
- Visbert: Hidden-state Visualizations For Transformers Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- Long Range Arena: A Benchmark For Efficient Transformers Yi Tay et al.
- Training Large Neural Networks With Constant Memory Using A New Execution Algorithm Bharadwaj Pudipeddi, Maral Mesmakhosroshahi, Jinwen Xi, Sujeeth Bharadwaj
- Unqovering Stereotyping Biases Via Underspecified Questions Tao Li, Tushar Khot, Daniel Khashabi, Ashish Sabharwal, Vivek Srikumar
- SOLOIST: Building Task Bots At Scale With Transfer Learning And Machine Teaching Baolin Peng et al.
- Robust Conversational AI With Grounded Text Generation Jianfeng Gao et al.
- Synthesizer: Rethinking Self-attention In Transformer Models Yi Tay et al.
- When Do You Need Billions Of Words Of Pretraining Data? Yian Zhang, Alex Warstadt, Haau-sing Li, Samuel R. Bowman
- Bert-hlstms: BERT And Hierarchical Lstms For Visual Storytelling Jing Su, Qingyun Dai, Frank Guerin, Mian Zhou
- Chatbot Interaction With Artificial Intelligence: Human Data Augmentation With T5 And Language Transformer Ensemble For Text Classification Jordan J. Bird, Anikó Ekárt, Diego R. Faria
- Improving Vision-and-language Navigation With Image-text Pairs From The Web Arjun Majumdar et al.
- SPECTER: Document-level Representation Learning Using Citation-informed Transformers Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld
- DIET: Lightweight Language Understanding For Dialogue Systems Tanja Bunk, Daksh Varshneya, Vladimir Vlasov, Alan Nichol
- MART: Memory-augmented Recurrent Transformer For Coherent Video Paragraph Captioning Jie Lei et al.
- From Zero To Hero: On The Limitations Of Zero-shot Cross-lingual Transfer With Multilingual Transformers Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš
- A Comparison Of LSTM And BERT For Small Corpus Aysu Ezen-can
- GMAT: Global Memory Augmentation For Transformers Ankit Gupta, Jonathan Berant
- Just Ask: Learning To Answer Questions From Millions Of Narrated Videos Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- Addressing Some Limitations Of Transformers With Feedback Memory Angela Fan, Thibaut Lavril, Edouard Grave, Armand Joulin, Sainbayar Sukhbaatar
- A Recurrent Vision-and-language BERT For Navigation Yicong Hong, Qi Wu, Yuankai Qi, Cristian Rodriguez-opazo, Stephen Gould
- Adapterdrop: On The Efficiency Of Adapters In Transformers Andreas Rücklé et al.
- Natural Language Rationales With Full-stack Visual Reasoning: From Pixels To Semantic Frames To Commonsense Graphs Ana Marasović et al.
- End-to-end Synthetic Data Generation For Domain Adaptation Of Question Answering Systems Siamak Shakeri et al.
- Ernie-doc: A Retrospective Long-document Modeling Transformer Siyu Ding et al.
- Behind The Scene: Revealing The Secrets Of Pre-trained Vision-and-language Models Jize Cao et al.
- Contrastive Learning With Adversarial Perturbations For Conditional Text Generation Seanie Lee, Dong Bok Lee, Sung Ju Hwang
- Proofwriter: Generating Implications, Proofs, And Abductive Statements Over Natural Language Oyvind Tafjord, Bhavana Dalvi Mishra, Peter Clark
- Cocon: A Self-supervised Approach For Controlled Text Generation Alvin Chan, Yew-soon Ong, Bill Pung, Aston Zhang, Jie Fu
- Leap-of-thought: Teaching Pre-trained Models To Systematically Reason Over Implicit Knowledge Alon Talmor, Oyvind Tafjord, Peter Clark, Yoav Goldberg, Jonathan Berant
- Edgebert: Sentence-level Energy Optimizations For Latency-aware Multi-task NLP Inference Thierry Tambe et al.
- Non-autoregressive Machine Translation With Disentangled Context Transformer Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu
- Coregen: Contextualized Code Representation Learning For Commit Message Generation Lun Yiu Nie et al.
- TRANS-BLSTM: Transformer With Bidirectional LSTM For Language Understanding Zhiheng Huang, Peng Xu, Davis Liang, Ajay Mishra, Bing Xiang
- Auto-captions On GIF: A Large-scale Video-sentence Dataset For Vision-language Pre-training Yingwei Pan et al.
- Intellicode Compose: Code Generation Using Transformer Alexey Svyatkovskiy, Shao Kun Deng, Shengyu Fu, Neel Sundaresan
- GREEK-BERT: The Greeks Visiting Sesame Street John Koutsikakis, Ilias Chalkidis, Prodromos Malakasiotis, Ion Androutsopoulos
- Improving Natural Language Processing Tasks With Human Gaze-guided Neural Attention Ekta Sood, Simon Tannert, Philipp Mueller, Andreas Bulling
- An Empirical Investigation Of Pre-trained Transformer Language Models For Open-domain Dialogue Generation Piji Li
- Adapterhub: A Framework For Adapting Transformers Jonas Pfeiffer et al.
- POINTER: Constrained Progressive Text Generation Via Insertion-based Generative Pre-training Yizhe Zhang et al.
- Prophetnet: Predicting Future N-gram For Sequence-to-sequence Pre-training Weizhen Qi et al.
- The Cascade Transformer: An Application For Efficient Answer Sentence Selection Luca Soldaini, Alessandro Moschitti
- Automated Source Code Generation And Auto-completion Using Deep Learning: Comparing And Discussing Current Language-model-related Approaches Juan Cruz-benito, Sanjay Vishwakarma, Francisco Martin-fernandez, Ismael Faro
- Encoding Syntactic Knowledge In Transformer Encoder For Intent Detection And Slot Filling Jixuan Wang, Kai Wei, Martin Radfar, Weiwei Zhang, Clement Chung
- Look Before You Speak: Visually Contextualized Utterances Paul Hongsuck Seo, Arsha Nagrani, Cordelia Schmid
- ETC: Encoding Long And Structured Inputs In Transformers Joshua Ainslie et al.
- Simplifying Paragraph-level Question Generation Via Transformer Language Models Luis Enrico Lopez, Diane Kathryn Cruz, Jan Christian Blaise Cruz, Charibeth Cheng
- Big Bird: Transformers For Longer Sequences Manzil Zaheer et al.
- Assessing Phrasal Representation And Composition In Transformers Lang Yu, Allyson Ettinger
- Dialoguetrm: Exploring The Intra- And Inter-modal Emotional Behaviors In The Conversation Yuzhao Mao et al.
- Text-to-text Pre-training For Data-to-text Tasks Mihir Kale, Abhinav Rastogi
- Hard-coded Gaussian Attention For Neural Machine Translation Weiqiu You, Simeng Sun, Mohit Iyyer
- Earlybert: Efficient BERT Training Via Early-bird Lottery Tickets Xiaohan Chen et al.
- Probing Pretrained Language Models For Lexical Semantics Ivan Vulić, Edoardo Maria Ponti, Robert Litschko, Goran Glavaš, Anna Korhonen
- How Fine Can Fine-tuning Be? Learning Efficient Language Models Evani Radiya-dixit, Xin Wang
- Calibration Of Pre-trained Transformers Shrey Desai, Greg Durrett
- Variational Transformers For Diverse Response Generation Zhaojiang Lin, Genta Indra Winata, Peng Xu, Zihan Liu, Pascale Fung
- Turngpt: A Transformer-based Language Model For Predicting Turn-taking In Spoken Dialog Erik Ekstedt, Gabriel Skantze
- Accelerating Training Of Transformer-based Language Models With Progressive Layer Dropping Minjia Zhang, Yuxiong He
- A Transformer-based Approach For Source Code Summarization Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-wei Chang
- CERT: Contrastive Self-supervised Learning For Language Understanding Hongchao Fang, Sicheng Wang, Meng Zhou, Jiayuan Ding, Pengtao Xie
- XLM-T: Scaling Up Multilingual Machine Translation With Pretrained Cross-lingual Transformer Encoders Shuming Ma et al.
- A Controllable Model Of Grounded Response Generation Zeqiu Wu et al.
- Length-adaptive Transformer: Train Once With Length Drop, Use Anytime With Search Gyuwan Kim, Kyunghyun Cho
- What Does BERT Know About Books, Movies And Music? Probing BERT For Conversational Recommendation Gustavo Penha, Claudia Hauff
- Rethinking Positional Encoding In Language Pre-training Guolin Ke, Di He, Tie-yan Liu
- Contrastive Triple Extraction With Generative Transformer Hongbin Ye et al.
- Aragpt2: Pre-trained Transformer For Arabic Language Generation Wissam Antoun, Fady Baly, Hazem Hajj
- Very Deep Transformers For Neural Machine Translation Xiaodong Liu, Kevin Duh, Liyuan Liu, Jianfeng Gao
- Mixup-transformer: Dynamic Data Augmentation For NLP Tasks Lichao Sun et al.
- Mobilebert: A Compact Task-agnostic BERT For Resource-limited Devices Zhiqing Sun et al.
- Dialogbert: Discourse-aware Response Generation Via Learning To Recover And Rank Utterances Xiaodong Gu, Kang Min Yoo, Jung-woo Ha
- Document Ranking With A Pretrained Sequence-to-sequence Model Rodrigo Nogueira, Zhiying Jiang, Jimmy Lin
- Indic-transformers: An Analysis Of Transformer Language Models For Indian Languages Kushal Jain, Adwait Deshpande, Kumar Shridhar, Felix Laumann, Ayushman Dash
- Unilmv2: Pseudo-masked Language Models For Unified Language Model Pre-training Hangbo Bao et al.
- Emptransfo: A Multi-head Transformer Architecture For Creating Empathetic Dialog Systems Rohola Zandie, Mohammad H. Mahoor
- DSTC8-AVSD: Multimodal Semantic Transformer Network With Retrieval Style Word Generator Hwanhee Lee et al.
- On The Effect Of Dropping Layers Of Pre-trained Transformer Models Hassan Sajjad, Fahim Dalvi, Nadir Durrani, Preslav Nakov
- Lightseq: A High Performance Inference Library For Transformers Xiaohui Wang, Ying Xiong, Yang Wei, Mingxuan Wang, Lei Li
- HAT: Hardware-aware Transformers For Efficient Natural Language Processing Hanrui Wang et al.
- Minilm: Deep Self-attention Distillation For Task-agnostic Compression Of Pre-trained Transformers Wenhui Wang et al.
- X-LXMERT: Paint, Caption And Answer Questions With Multi-modal Transformers Jaemin Cho, Jiasen Lu, Dustin Schwenk, Hannaneh Hajishirzi, Aniruddha Kembhavi
- Delight: Deep And Light-weight Transformer Sachin Mehta, Marjan Ghazvininejad, Srinivasan Iyer, Luke Zettlemoyer, Hannaneh Hajishirzi
- On Optimal Transformer Depth For Low-resource Language Translation Elan Van Biljon, Arnu Pretorius, Julia Kreutzer
- Logical Natural Language Generation From Open-domain Tables Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang
- Mt5: A Massively Multilingual Pre-trained Text-to-text Transformer Linting Xue et al.
- Data Augmentation Using Pre-trained Transformer Models Varun Kumar, Ashutosh Choudhary, Eunah Cho
- Rethinking The Value Of Transformer Components Wenxuan Wang, Zhaopeng Tu
- As Good As New. How To Successfully Recycle English GPT-2 To Make Models For Other Languages Wietse De Vries, Malvina Nissim
- Funnel-transformer: Filtering Out Sequential Redundancy For Efficient Language Processing Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le
- Rethinking Embedding Coupling In Pre-trained Language Models Hyung Won Chung, Thibault Févry, Henry Tsai, Melvin Johnson, Sebastian Ruder
- Longformer: The Long-document Transformer Iz Beltagy, Matthew E. Peters, Arman Cohan
- Retrofitting Structure-aware Transformer Language Model For End Tasks Hao Fei, Yafeng Ren, Donghong Ji
- Minilmv2: Multi-head Self-attention Relation Distillation For Compressing Pretrained Transformers Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei
- An Exploratory Study On Long Dialogue Summarization: What Works And What's Next Yusen Zhang et al.
- Lightner: A Lightweight Tuning Paradigm For Low-resource NER Via Pluggable Prompting Xiang Chen et al.
- Vlmo: Unified Vision-language Pre-training With Mixture-of-modality-experts Hangbo Bao et al.
- The NLP Cookbook: Modern Recipes For Transformer Based Deep Learning Architectures Sushant Singh, Ausif Mahmood
- Ernie-vilg: Unified Generative Pre-training For Bidirectional Vision-language Generation Han Zhang et al.
- E2E-VLP: End-to-end Vision-language Pre-training Enhanced By Visual Learning Haiyang Xu et al.
- Evaluating The Robustness Of Retrieval Pipelines With Query Variation Generators Gustavo Penha, Arthur Câmara, Claudia Hauff
- Contrastive Learning For Many-to-many Multilingual Neural Machine Translation Xiao Pan, Mingxuan Wang, Liwei Wu, Lei Li
- G-transformer For Document-level Machine Translation Guangsheng Bao, Yue Zhang, Zhiyang Teng, Boxing Chen, Weihua Luo
- Unifying Multimodal Transformer For Bi-directional Image And Text Generation Yupan Huang, Hongwei Xue, Bei Liu, Yutong Lu
- Improved Text Classification Via Contrastive Adversarial Training Lin Pan, Chung-wei Hang, Avirup Sil, Saloni Potdar
- Improving Stack Overflow Question Title Generation With Copying Enhanced Codebert Model And Bi-modal Information Fengji Zhang et al.
- Progressive Transformer-based Generation Of Radiology Reports Farhad Nooralahzadeh, Nicolas Perez Gonzalez, Thomas Frauenfelder, Koji Fujimoto, Michael Krauthammer
- Mention Memory: Incorporating Textual Knowledge Into Transformers Through Entity Mention Attention Michiel De Jong, Yury Zemlyanskiy, Nicholas Fitzgerald, Fei Sha, William Cohen
- Robeczech: Czech Roberta, A Monolingual Contextualized Language Representation Model Milan Straka, Jakub Náplava, Jana Straková, David Samuel
- Text-free Prosody-aware Generative Spoken Language Modeling Eugene Kharitonov et al.
- Advancing High-resolution Video-language Representation With Large-scale Video Transcriptions Hongwei Xue et al.
- Wangchanberta: Pretraining Transformer-based Thai Language Models Lalita Lowphansirikul, Charin Polpanumas, Nawat Jantrakulchai, Sarana Nutanong
- Generic Attention-model Explainability For Interpreting Bi-modal And Encoder-decoder Transformers Hila Chefer, Shir Gur, Lior Wolf
- Explaining Documents' Relevance To Search Queries Razieh Rahimi, Youngwoo Kim, Hamed Zamani, James Allan
- Transformer-based Conditional Variational Autoencoder For Controllable Story Generation Le Fang et al.
- Learning Rich Representation Of Keyphrases From Text Mayank Kulkarni, Debanjan Mahata, Ravneet Arora, Rajarshi Bhowmik
- Personalized Transformer For Explainable Recommendation Lei Li, Yongfeng Zhang, Li Chen
- FILIP: Fine-grained Interactive Language-image Pre-training Lewei Yao et al.
- An Explanation Of In-context Learning As Implicit Bayesian Inference Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma
- Vision-and-language Or Vision-for-language? On Cross-modal Influence In Multimodal Transformers Stella Frank, Emanuele Bugliarello, Desmond Elliott
- Swinbert: End-to-end Transformers With Sparse Attention For Video Captioning Kevin Lin et al.
- KAT: A Knowledge Augmented Transformer For Vision-and-language Liangke Gui et al.
- Pretrained Transformers As Universal Computation Engines Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch
- One Chatbot Per Person: Creating Personalized Chatbots Based On Implicit User Profiles Zhengyi Ma, Zhicheng Dou, Yutao Zhu, Hanxun Zhong, Ji-rong Wen
- Using Prior Knowledge To Guide Bert's Attention In Semantic Textual Matching Tasks Tingyu Xia, Yue Wang, Yuan Tian, Yi Chang
- GPT-3 Models Are Poor Few-shot Learners In The Biomedical Domain Milad Moradi, Kathrin Blagec, Florian Haberl, Matthias Samwald
- Conversational Question Answering Over Knowledge Graphs With Transformer And Graph Attention Networks Endri Kacupaj et al.
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- Bitfit: Simple Parameter-efficient Fine-tuning For Transformer-based Masked Language-models Elad Ben Zaken, Shauli Ravfogel, Yoav Goldberg
- Arat5: Text-to-text Transformers For Arabic Language Generation El Moatez Billah Nagoudi, Abdelrahim Elmadany, Muhammad Abdul-mageed
- Investigating The Limitations Of Transformers With Simple Arithmetic Tasks Rodrigo Nogueira, Zhiying Jiang, Jimmy Lin
- Lora: Low-rank Adaptation Of Large Language Models Edward J. Hu et al.
- Align And Prompt: Video-and-language Pre-training With Entity Prompts Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C. H. Hoi
- Sequence Length Is A Domain: Length-based Overfitting In Transformer Models Dušan Variš, Ondřej Bojar
- Text Compression-aided Transformer Encoding Zuchao Li et al.
- Trankit: A Light-weight Transformer-based Toolkit For Multilingual Natural Language Processing Minh Van Nguyen, Viet Dac Lai, Amir Pouran Ben Veyseh, Thien Huu Nguyen
- Clip4caption: CLIP For Video Caption Mingkang Tang et al.
- Generate, Annotate, And Learn: NLP With Synthetic Text Xuanli He, Islam Nassar, Jamie Kiros, Gholamreza Haffari, Mohammad Norouzi
- Urltran: Improving Phishing URL Detection Using Transformers Pranav Maneriker et al.
- Causal Attention For Vision-language Tasks Xu Yang, Hanwang Zhang, Guojun Qi, Jianfei Cai
- Self-guided Contrastive Learning For BERT Sentence Representations Taeuk Kim, Kang Min Yoo, Sang-goo Lee
- TR-BERT: Dynamic Token Reduction For Accelerating BERT Inference Deming Ye, Yankai Lin, Yufei Huang, Maosong Sun
- Compressing Visual-linguistic Model Via Knowledge Distillation Zhiyuan Fang et al.
- Hierarchical Learning For Generation With Long Source Sequences Tobias Rohde, Xiaoxia Wu, Yinhan Liu
- Primer: Searching For Efficient Transformers For Language Modeling David R. So et al.
- Greedy-layer Pruning: Speeding Up Transformer Models For Natural Language Processing David Peer, Sebastian Stabinger, Stefan Engl, Antonio Rodriguez-sanchez
- Cross-attention Is All You Need: Adapting Pretrained Transformers For Machine Translation Mozhdeh Gheini, Xiang Ren, Jonathan May
- Luna: Linear Unified Nested Attention Xuezhe Ma et al.
- Adaptive Semiparametric Language Models Dani Yogatama, Cyprien De Masson D'autume, Lingpeng Kong
- Knowledge Neurons In Pretrained Transformers Damai Dai et al.
- Long-span Summarization Via Local Attention And Content Selection Potsawee Manakul, Mark J. F. Gales
- Codet5: Identifier-aware Unified Pre-trained Encoder-decoder Models For Code Understanding And Generation Yue Wang, Weishi Wang, Shafiq Joty, Steven C. H. Hoi
- Multimodal Transformer With Variable-length Memory For Vision-and-language Navigation Chuang Lin et al.
- Fastformer: Additive Attention Can Be All You Need Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang, Xing Xie
- Larger-scale Transformers For Multilingual Masked Language Modeling Naman Goyal, Jingfei Du, Myle Ott, Giri Anantharaman, Alexis Conneau
- Prompting Visual-language Models For Efficient Video Understanding Chen Ju, Tengda Han, Kunhao Zheng, Ya Zhang, Weidi Xie
- Terapipe: Token-level Pipeline Parallelism For Training Large-scale Language Models Zhuohan Li et al.
- NÜWA: Visual Synthesis Pre-training For Neural Visual World Creation Chenfei Wu et al.
- Non-invasive Self-attention For Side Information Fusion In Sequential Recommendation Chang Liu et al.
- Climatebert: A Pretrained Language Model For Climate-related Text Nicolas Webersinke, Mathias Kraus, Julia Anna Bingler, Markus Leippold
- See, Hear, Read: Leveraging Multimodality With Guided Attention For Abstractive Text Summarization Yash Kumar Atri, Shraman Pramanick, Vikram Goyal, Tanmoy Chakraborty
- Recent Advances In Natural Language Processing Via Large Pre-trained Language Models: A Survey Bonan Min et al.
- What Changes Can Large-scale Language Models Bring? Intensive Study On Hyperclova: Billions-scale Korean Generative Pretrained Transformers Boseop Kim et al.
- MMBERT: Multimodal BERT Pretraining For Improved Medical VQA Yash Khare et al.
- Understanding And Overcoming The Challenges Of Efficient Transformer Quantization Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort
- SGEITL: Scene Graph Enhanced Image-text Learning For Visual Commonsense Reasoning Zhecan Wang et al.
- DYLE: Dynamic Latent Extraction For Abstractive Long-input Summarization Ziming Mao et al.
- Scale Efficiently: Insights From Pre-training And Fine-tuning Transformers Yi Tay et al.
- Are Pre-trained Convolutions Better Than Pre-trained Transformers? Yi Tay et al.
- CDLM: Cross-document Language Modeling Avi Caciularu et al.
- Towards Facilitating Empathic Conversations In Online Mental Health Support: A Reinforcement Learning Approach Ashish Sharma, Inna W. Lin, Adam S. Miner, David C. Atkins, Tim Althoff
- Multilingual Language Models Predict Human Reading Behavior Nora Hollenstein, Federico Pirovano, Ce Zhang, Lena Jäger, Lisa Beinborn
- Hierarchical Task Learning From Language Instructions With Unified Transformers And Self-monitoring Yichi Zhang, Joyce Chai
- Prune Once For All: Sparse Pre-trained Language Models Ofir Zafrir, Ariel Larey, Guy Boudoukh, Haihao Shen, Moshe Wasserblat
- Human Parity On Commonsenseqa: Augmenting Self-attention With External Attention Yichong Xu et al.
- What Do Pre-trained Code Models Know About Code? Anjan Karmakar, Romain Robbes
- An Empirical Study Of Training End-to-end Vision-and-language Transformers Zi-yi Dou et al.
- Wordcraft: A Human-ai Collaborative Editor For Story Writing Andy Coenen, Luke Davis, Daphne Ippolito, Emily Reif, Ann Yuan
- Mind The Gap: Assessing Temporal Generalization In Neural Language Models Angeliki Lazaridou et al.
- When Attention Meets Fast Recurrence: Training Language Models With Reduced Compute Tao Lei
- Demix Layers: Disentangling Domains For Modular Language Modeling Suchin Gururangan, Mike Lewis, Ari Holtzman, Noah A. Smith, Luke Zettlemoyer
- Episodic Transformer For Vision-and-language Navigation Alexander Pashevich, Cordelia Schmid, Chen Sun
- KM-BART: Knowledge Enhanced Multimodal BART For Visual Commonsense Generation Yiran Xing et al.
- Large Pre-trained Language Models Contain Human-like Biases Of What Is Right And Wrong To Do Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin A. Rothkopf, Kristian Kersting
- MT6: Multilingual Pretrained Text-to-text Transformer With Translation Pairs Zewen Chi et al.
- Quiz-style Question Generation For News Stories Adam D. Lelkes, Vinh Q. Tran, Cong Yu
- TURINGBENCH: A Benchmark Environment For Turing Test In The Age Of Neural Text Generation Adaku Uchendu, Zeyu Ma, Thai Le, Rui Zhang, Dongwon Lee
- Embodied BERT: A Transformer Model For Embodied, Language-guided Visual Task Completion Alessandro Suglia, Qiaozi Gao, Jesse Thomason, Govind Thattai, Gaurav Sukhatme
- Visqa: X-raying Vision And Language Reasoning In Transformers Theo Jaunet et al.
- Distilling Large Language Models Into Tiny And Effective Students Using Pqrnn Prabhu Kaliamoorthi, Aditya Siddhant, Edward Li, Melvin Johnson
- CANINE: Pre-training An Efficient Tokenization-free Encoder For Language Representation Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting
- Cotext: Multi-task Learning With Code-text Transformer Long Phan et al.
- Rome Was Built In 1776: A Case Study On Factual Correctness In Knowledge-grounded Response Generation Sashank Santhanam et al.
- Scifive: A Text-to-text Transformer Model For Biomedical Literature Long N. Phan et al.
- Exploring Transformers In Natural Language Generation: GPT, BERT, And Xlnet M. Onat Topal, Anil Bas, Imke Van Heerden
- Learned Token Pruning For Transformers Sehoon Kim et al.
- I-BERT: Integer-only BERT Quantization Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer
- Sentence-t5: Scalable Sentence Encoders From Pre-trained Text-to-text Models Jianmo Ni et al.
- Taming Sparsely Activated Transformer With Stochastic Experts Simiao Zuo et al.
- UFO: A Unified Transformer For Vision-language Representation Learning Jianfeng Wang et al.
- Condenser: A Pre-training Architecture For Dense Retrieval Luyu Gao, Jamie Callan
- Fastmoe: A Fast Mixture-of-expert Training System Jiaao He et al.
- Hiddencut: Simple Data Augmentation For Natural Language Understanding With Better Generalization Jiaao Chen, Dinghan Shen, Weizhu Chen, Diyi Yang
- Planning With Learned Entity Prompts For Abstractive Summarization Shashi Narayan et al.
- FLAT: An Optimized Dataflow For Mitigating Attention Bottlenecks Sheng-chun Kao, Suvinay Subramanian, Gaurav Agrawal, Amir Yazdanbakhsh, Tushar Krishna
- Lightningdot: Pre-training Visual-semantic Embeddings For Real-time Image-text Retrieval Siqi Sun et al.
- Code Structure Guided Transformer For Source Code Summarization Shuzheng Gao et al.
- Show Your Work: Scratchpads For Intermediate Computation With Language Models Maxwell Nye et al.
- Scaling Language Models: Methods, Analysis & Insights From Training Gopher Jack W. Rae et al.
- Diagnosing Vision-and-language Navigation: What Really Matters Wanrong Zhu et al.
- Longt5: Efficient Text-to-text Transformer For Long Sequences Mandy Guo et al.
- How Much Do Language Models Copy From Their Training Data? Evaluating Linguistic Novelty In Text Generation Using RAVEN R. Thomas Mccoy, Paul Smolensky, Tal Linzen, Jianfeng Gao, Asli Celikyilmaz
- Augmenting Sequential Recommendation With Pseudo-prior Items Via Reversely Pre-training Transformer Zhiwei Liu, Ziwei Fan, Yu Wang, Philip S. Yu
- Dialogue History Matters! Personalized Response Selectionin Multi-turn Retrieval-based Chatbots Juntao Li et al.
- Improving Language Models By Retrieving From Trillions Of Tokens Sebastian Borgeaud et al.
- Byt5: Towards A Token-free Future With Pre-trained Byte-to-byte Models Linting Xue et al.
- Visualgpt: Data-efficient Adaptation Of Pretrained Language Models For Image Captioning Jun Chen, Han Guo, Kai Yi, Boyang Li, Mohamed Elhoseiny
- MATE: Multi-view Attention For Table Transformer Efficiency Julian Martin Eisenschlos, Maharshi Gor, Thomas Müller, William W. Cohen
- Robertuito: A Pre-trained Language Model For Social Media Text In Spanish Juan Manuel Pérez, Damián A. Furman, Laura Alonso Alemany, Franco Luque
- Normformer: Improved Transformer Pretraining With Extra Normalization Sam Shleifer, Jason Weston, Myle Ott
- AMMUS: A Survey Of Transformer-based Pretrained Models In Natural Language Processing Katikapalli Subramanyam Kalyan, Ajit Rajasekharan, Sivanesan Sangeetha
- A Comparative Study Of Transformer-based Language Models On Extractive Question Answering Kate Pearce, Tiffany Zhan, Aneesh Komanduri, Justin Zhan
- Training Verifiers To Solve Math Word Problems Karl Cobbe et al.
- Retromae: Pre-training Retrieval-oriented Language Models Via Masked Auto-encoder Shitao Xiao, Zheng Liu, Yingxia Shao, Zhao Cao
- Reacc: A Retrieval-augmented Code Completion Framework Shuai Lu et al.
- Vision-and-language Pretrained Models: A Survey Siqu Long, Feiqi Cao, Soyeon Caren Han, Haiqin Yang
- Murag: Multimodal Retrieval-augmented Generator For Open Question Answering Over Images And Text Wenhu Chen, Hexiang Hu, Xi Chen, Pat Verga, William W. Cohen
- Cogvideo: Large-scale Pretraining For Text-to-video Generation Via Transformers Wenyi Hong, Ming Ding, Wendi Zheng, Xinghan Liu, Jie Tang
- An Efficient Memory-augmented Transformer For Knowledge-intensive NLP Tasks Yuxiang Wu et al.
- A Length-extrapolatable Transformer Yutao Sun et al.
- Contrastive Learning With Bidirectional Transformers For Sequential Recommendation Hanwen Du et al.
- A Survey Of Controllable Text Generation Using Transformer-based Pre-trained Language Models Hanqing Zhang, Haolin Song, Shaoyu Li, Ming Zhou, Dawei Song
- Vl-beit: Generative Vision-language Pretraining Hangbo Bao, Wenhui Wang, Li Dong, Furu Wei
- Pali: A Jointly-scaled Multilingual Language-image Model Xi Chen et al.
- Hybrid Transformer With Multi-level Fusion For Multimodal Knowledge Graph Completion Xiang Chen et al.
- Layoutlmv3: Pre-training For Document AI With Unified Text And Image Masking Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei
- LUT-GEMM: Quantized Matrix Multiplication Based On Luts For Efficient Inference In Large-scale Generative Language Models Gunho Park et al.
- Moral Mimicry: Large Language Models Produce Moral Rationalizations Tailored To Political Identity Gabriel Simmons
- Hitskt: A Hierarchical Transformer Model For Session-aware Knowledge Tracing Fucai Ke et al.
- Ignore Previous Prompt: Attack Techniques For Language Models Fábio Perez, Ian Ribeiro
- Mass-editing Memory In A Transformer Kevin Meng, Arnab Sen Sharma, Alex Andonian, Yonatan Belinkov, David Bau
- Zeroquant: Efficient And Affordable Post-training Quantization For Large-scale Transformers Zhewei Yao et al.
- Flashattention: Fast And Memory-efficient Exact Attention With Io-awareness Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré
- Minicons: Enabling Flexible Behavioral And Representational Analyses Of Transformer Language Models Kanishka Misra
- VLC-BERT: Visual Question Answering With Contextualized Commonsense Knowledge Sahithya Ravi, Aditya Chinchure, Leonid Sigal, Renjie Liao, Vered Shwartz
- Accelerating Attention Through Gradient-based Learned Runtime Pruning Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, Hadi Esmaeilzadeh, Mingu Kang
- Training Compute-optimal Large Language Models Jordan Hoffmann et al.
- Cramming: Training A Language Model On A Single GPU In One Day Jonas Geiping, Tom Goldstein
- Towards Trustworthy Autograding Of Short, Multi-lingual, Multi-type Answers Johannes Schneider, Robin Richner, Micha Riser
- Biogpt: Generative Pre-trained Transformer For Biomedical Text Generation And Mining Renqian Luo et al.
- Incorporating Domain Knowledge Through Task Augmentation For Front-end Javascript Code Generation Sijie Shen et al.
- RASAT: Integrating Relational Structures Into Pretrained Seq2seq Model For Text-to-sql Jiexing Qi et al.
- Unified-io: A Unified Model For Vision, Language, And Multi-modal Tasks Jiasen Lu, Christopher Clark, Rowan Zellers, Roozbeh Mottaghi, Aniruddha Kembhavi
- Lilt: A Simple Yet Effective Language-independent Layout Transformer For Structured Document Understanding Jiapeng Wang, Lianwen Jin, Kai Ding
- GIT: A Generative Image-to-text Transformer For Vision And Language Jianfeng Wang et al.
- Improving The Domain Adaptation Of Retrieval Augmented Generation (RAG) Models For Open Domain Question Answering Shamane Siriwardhana et al.
- Scaling Autoregressive Models For Content-rich Text-to-image Generation Jiahui Yu et al.
- Coca: Contrastive Captioners Are Image-text Foundation Models Jiahui Yu et al.
- Gtrans: Grouping And Fusing Transformer Layers For Neural Machine Translation Jian Yang et al.
- SPACE-3: Unified Dialog Model Pre-training For Task-oriented Dialog Understanding And Generation Wanwei He et al.
- BERTIN: Efficient Pre-training Of A Spanish Language Model Using Perplexity Sampling Javier De La Rosa et al.
- Dall-eval: Probing The Reasoning Skills And Social Biases Of Text-to-image Generation Models Jaemin Cho, Abhay Zala, Mohit Bansal
- Using Deepspeed And Megatron To Train Megatron-turing NLG 530B, A Large-scale Generative Language Model Shaden Smith et al.
- Inpars: Data Augmentation For Information Retrieval Using Large Language Models Luiz Bonifacio, Hugo Abonizio, Marzieh Fadaee, Rodrigo Nogueira
- What Language Model Architecture And Pretraining Objective Work Best For Zero-shot Generalization? Thomas Wang et al.
- Vit5: Pretrained Text-to-text Transformer For Vietnamese Language Generation Long Phan, Hieu Tran, Hieu Nguyen, Trieu H. Trinh
- Lamda: Language Models For Dialog Applications Romal Thoppilan et al.
- Efficient Few-shot Learning Without Prompts Lewis Tunstall et al.
- Phenaki: Variable Length Video Generation From Open Domain Textual Description Ruben Villegas et al.
- Personalized Prompt Learning For Explainable Recommendation Lei Li, Yongfeng Zhang, Li Chen
- Data Distributional Properties Drive Emergent In-context Learning In Transformers Stephanie C. Y. Chan et al.
- Efficient Long-text Understanding With Short-text Models Maor Ivgi, Uri Shaham, Jonathan Berant
- Language Models Are Realistic Tabular Data Generators Vadim Borisov, Kathrin Seßler, Tobias Leemann, Martin Pawelczyk, Gjergji Kasneci
- OPT: Open Pre-trained Transformer Language Models Susan Zhang et al.
- Visual Prompt Tuning Menglin Jia et al.
- Can Machines Help Us Answering Question 16 In Datasheets, And In Turn Reflecting On Inappropriate Content? Patrick Schramowski, Christopher Tauchmann, Kristian Kersting
- Vl-interpret: An Interactive Visualization Tool For Interpreting Vision-language Transformers Estelle Aflalo et al.
- GPTQ: Accurate Post-training Quantization For Generative Pre-trained Transformers Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh
- The Optimal BERT Surgeon: Scalable And Accurate Second-order Pruning For Large Language Models Eldar Kurtic et al.
- Bytetransformer: A High-performance Transformer Boosted For Variable-length Inputs Yujia Zhai et al.
- Hyperprompt: Prompt-based Task-conditioning Of Transformers Yun He et al.
- VIMA: General Robot Manipulation With Multimodal Prompts Yunfan Jiang et al.
- Block-recurrent Transformers Delesley Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur
- Future Transformer For Long-term Action Anticipation Dayoung Gong, Joonseok Lee, Manjin Kim, Seong Jong Ha, Minsu Cho
- Hungry Hungry Hippos: Towards Language Modeling With State Space Models Daniel Y. Fu et al.
- Why Can GPT Learn In-context? Language Models Implicitly Perform Gradient Descent As Meta-optimizers Damai Dai et al.
- Democratizing Contrastive Language-image Pre-training: A CLIP Benchmark Of Data, Model, And Supervision Yufeng Cui, Lichen Zhao, Feng Liang, Yangguang Li, Jing Shao
- Competition-level Code Generation With Alphacode Yujia Li et al.
- Fast Inference From Transformers Via Speculative Decoding Yaniv Leviathan, Matan Kalman, Yossi Matias
- Mplug: Effective And Efficient Vision-language Learning By Cross-modal Skip-connections Chenliang Li et al.
- Long-form Video-language Pre-training With Multimodal Temporal Contrastive Learning Yuchong Sun et al.
- Exploring Length Generalization In Large Language Models Cem Anil et al.
- In-context Learning And Induction Heads Catherine Olsson et al.
- A Survey On Model Compression And Acceleration For Pretrained Language Models Canwen Xu, Julian Mcauley
- Adamix: Mixture-of-adaptations For Parameter-efficient Model Tuning Yaqing Wang et al.
- Why Does Surprisal From Larger Transformer-based Language Models Provide A Poorer Fit To Human Reading Times? Byung-doh Oh, William Schuler
- Expanding Language-image Pretrained Models For General Video Recognition Bolin Ni et al.
- Survey Of Hallucination In Natural Language Generation Ziwei Ji et al.
- Super-naturalinstructions: Generalization Via Declarative Instructions On 1600+ NLP Tasks Yizhong Wang et al.
- BLOOM: A 176b-parameter Open-access Multilingual Language Model Bigscience Workshop et al.
- Memorizing Transformers Yuhuai Wu, Markus N. Rabe, Delesley Hutchins, Christian Szegedy
- St-moe: Designing Stable And Transferable Sparse Expert Models Barret Zoph et al.
- What Do They Capture? -- A Structural Analysis Of Pre-trained Language Models For Source Code Yao Wan et al.
- Retrieval Augmentation Of Large Language Models For Lay Language Generation Yue Guo, Wei Qiu, Gondy Leroy, Sheng Wang, Trevor Cohen
- Recurrent Memory Transformer Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev
- T-NER: An All-round Python Library For Transformer-based Named Entity Recognition Asahi Ushio, Jose Camacho-collados
- Reshaping Robot Trajectories Using Natural Language Commands: A Study Of Multi-modal Data Alignment Using Transformers Arthur Bucker et al.
- Clinical-longformer And Clinical-bigbird: Transformers For Long Clinical Sequences Yikuan Li, Ramsey M. Wehbe, Faraz S. Ahmad, Hanyin Wang, Yuan Luo
- Language Model Compression With Weighted Low-rank Factorization Yen-chang Hsu et al.
- A Model-agnostic Data Manipulation Method For Persona-based Dialogue Generation Yu Cao, Wei Bi, Meng Fang, Shuming Shi, Dacheng Tao
- Empowering Language Models With Knowledge Graph Reasoning For Question Answering Ziniu Hu et al.
- A Systematic Review And Replicability Study Of Bert4rec For Sequential Recommendation Aleksandr Petrov, Craig Macdonald
- Storydall-e: Adapting Pretrained Text-to-image Transformers For Story Continuation Adyasha Maharana, Darryl Hannan, Mohit Bansal
- Transformer Language Models Without Positional Encodings Still Learn Positional Information Adi Haviv, Ori Ram, Ofir Press, Peter Izsak, Omer Levy
- A New Path: Scaling Vision-and-language Navigation With Synthetic Instructions And Imitation Learning Aishwarya Kamath et al.
- Beyond The Imitation Game: Quantifying And Extrapolating The Capabilities Of Language Models Aarohi Shammie Srivastava et al.
- TALM: Tool Augmented Language Models Aaron Parisi, Yao Zhao, Noah Fiedel
- Palm: Scaling Language Modeling With Pathways Aakanksha Chowdhery et al.
- What Matters In Language Conditioned Robotic Imitation Learning Over Unstructured Data Oier Mees, Lukas Hermann, Wolfram Burgard
- Generative Spoken Dialogue Language Modeling Tu Anh Nguyen et al.
- Parallel Context Windows For Large Language Models Nir Ratner et al.
- SGPT: GPT Sentence Embeddings For Semantic Search Niklas Muennighoff
- Arabart: A Pretrained Arabic Sequence-to-sequence Model For Abstractive Summarization Moussa Kamal Eddine, Nadi Tomeh, Nizar Habash, Joseph Le Roux, Michalis Vazirgiannis
- Transformer Feed-forward Layers Build Predictions By Promoting Concepts In The Vocabulary Space Mor Geva, Avi Caciularu, Kevin Ro Wang, Yoav Goldberg
- Llm.int8(): 8-bit Matrix Multiplication For Transformers At Scale Tim Dettmers, Mike Lewis, Younes Belkada, Luke Zettlemoyer
- An Empirical Study Of End-to-end Video-language Transformers With Masked Visual Modeling Tsu-jui Fu et al.
- Retrieval-augmented Multimodal Language Modeling Michihiro Yasunaga et al.
- Few-shot Training Llms For Project-specific Code-summarization Toufique Ahmed, Premkumar Devanbu
- CLIPPO: Image-and-language Understanding From Pixels Only Michael Tschannen, Basil Mustafa, Neil Houlsby
- Training And Evaluating A Jupyter Notebook Data Science Assistant Shubham Chandel, Colin B. Clement, Guillermo Serrato, Neel Sundaresan
- Re2g: Retrieve, Rerank, Generate Michael Glass et al.
- Transformer Quality In Linear Time Weizhe Hua, Zihang Dai, Hanxiao Liu, Quoc V. Le
- Confident Adaptive Language Modeling Tal Schuster et al.
- Meta Policy Learning For Cold-start Conversational Recommendation Zhendong Chu, Hongning Wang, Yun Xiao, Bo Long, Lingfei Wu
- Zero- And Few-shot Prompting With Llms: A Comparative Study With Fine-tuned Models For Bangla Sentiment Analysis Md. Arid Hasan et al.
- CTRAN: Cnn-transformer-based Network For Natural Language Understanding Mehrdad Rafiepour, Javad Salimi Sartakhti
- Do Large Language Models Resemble Humans In Language Use? Zhenguang G. Cai, Xufeng Duan, David A. Haslett, Shuqi Wang, Martin J. Pickering
- Natural Language Generation And Understanding Of Big Code For Ai-assisted Programming: A Review Man Fai Wong, Shangxin Guo, Ching Nam Hang, Siu Wai Ho, Chee Wei Tan
- Parameter-efficient Fine-tuning Methods For Pretrained Language Models: A Critical Review And Assessment Lingling Xu, Haoran Xie, Si-zhao Joe Qin, Xiaohui Tao, Fu Lee Wang
- Deep Learning Mental Health Dialogue System Lennart Brocki, George C. Dyer, Anna Gładka, Neo Christopher Chung
- Sentimentgpt: Exploiting GPT For Advanced Sentiment Analysis And Its Departure From Current Machine Learning Kiana Kheiri, Hamid Karimi
- A Survey Of GPT-3 Family Large Language Models Including Chatgpt And GPT-4 Katikapalli Subramanyam Kalyan
- BLIP-2: Bootstrapping Language-image Pre-training With Frozen Image Encoders And Large Language Models Junnan Li, Dongxu Li, Silvio Savarese, Steven Hoi
- GQA: Training Generalized Multi-query Transformer Models From Multi-head Checkpoints Joshua Ainslie et al.
- LEXTREME: A Multi-lingual And Multi-task Benchmark For The Legal Domain Joel Niklaus et al.
- Missrec: Pre-training And Transferring Multi-modal Interest-aware Sequence Representation For Recommendation Jinpeng Wang et al.
- Graphix-t5: Mixing Pre-trained Transformers With Graph-aware Layers For Text-to-sql Parsing Jinyang Li et al.
- Longnet: Scaling Transformers To 1,000,000,000 Tokens Jiayu Ding et al.
- Unified-io 2: Scaling Autoregressive Multimodal Models With Vision, Language, Audio, And Action Jiasen Lu et al.
- LLM Lies: Hallucinations Are Not Bugs, But Features As Adversarial Examples Jia-yu Yao et al.
- Unlearn What You Want To Forget: Efficient Unlearning For Llms Jiaao Chen, Diyi Yang
- Learning To Compress Prompts With Gist Tokens Jesse Mu, Xiang Lisa Li, Noah Goodman
- Chatgpt: Jack Of All Trades, Master Of None Jan Kocoń et al.
- Large Language Models (GPT) Struggle To Answer Multiple-choice Questions About Code Jaromir Savelka, Arav Agarwal, Christopher Bogart, Majd Sakr
- Simple And Controllable Music Generation Jade Copet et al.
- Evaluation Of Chatgpt On Biomedical Tasks: A Zero-shot Comparison With Fine-tuned Generative Transformers Israt Jahan, Md Tahmid Rahman Laskar, Chun Peng, Jimmy Huang
- Muse: Text-to-image Generation Via Masked Generative Transformers Huiwen Chang et al.
- Ip-adapter: Text Compatible Image Prompt Adapter For Text-to-image Diffusion Models Hu Ye, Jun Zhang, Sibo Liu, Xiao Han, Wei Yang
- Extending Context Window Of Large Language Models Via Positional Interpolation Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian
- Recommender Systems With Generative Retrieval Shashank Rajput et al.
- Decoding Chatgpt: A Taxonomy Of Existing Research, Current Challenges, And Possible Future Directions Shahab Saquib Sohail et al.
- Chatgpt Or Human? Detect And Explain. Explaining Decisions Of Machine Learning Model For Detecting Short Chatgpt-generated Text Sandra Mitrović, Davide Andreoletti, Omran Ayoub
- Let's Have A Chat! A Conversation With Chatgpt: Technology, Applications, And Limitations Sakib Shahriar, Kadhim Hayawi
- Tinystories: How Small Can Language Models Be And Still Speak Coherent English? Ronen Eldan, Yuanzhi Li
- Palm 2 Technical Report Rohan Anil et al.
- In-context Learning Creates Task Vectors Roee Hendel, Mor Geva, Amir Globerson
- Scaling Laws For Language Encoding Models In Fmri Richard Antonello, Aditya Vaidya, Alexander G. Huth
- Llama-adapter: Efficient Fine-tuning Of Language Models With Zero-init Attention Renrui Zhang et al.
- Grounded Text-to-image Synthesis With Attention Refocusing Quynh Phung, Songwei Ge, Jia-bin Huang
- Medcpt: Contrastive Pre-trained Transformers With Large-scale Pubmed Search Logs For Zero-shot Biomedical Information Retrieval Qiao Jin et al.
- Harnessing Llms In Curricular Design: Using GPT-4 To Support Authoring Of Learning Objectives Pragnya Sridhar et al.
- GPT-4 Technical Report Openai et al.
- Faith And Fate: Limits Of Transformers On Compositionality Nouha Dziri et al.
- Scaling Vision Transformers To 22 Billion Parameters Mostafa Dehghani et al.
- Hyena Hierarchy: Towards Larger Convolutional Language Models Michael Poli et al.
- Multimodal-gpt: A Vision And Language Model For Dialogue With Humans Tao Gong et al.
- Textbooks Are All You Need Suriya Gunasekar et al.
- Unlocking The Potential Of Chatgpt: A Comprehensive Exploration Of Its Applications, Advantages, Limitations, And Future Directions In Natural Language Processing Walid Hariri
- Automated Reading Passage Generation With Openai's Large Language Model Ummugul Bezirhan, Matthias Von Davier
- Language Model Behavior: A Comprehensive Survey Tyler A. Chang, Benjamin K. Bergen
- Flashattention-2: Faster Attention With Better Parallelism And Work Partitioning Tri Dao
- Automatic Semantic Augmentation Of Language Model Prompts (for Code Summarization) Toufique Ahmed, Kunal Suresh Pai, Premkumar Devanbu, Earl T. Barr
- REPLUG: Retrieval-augmented Black-box Language Models Weijia Shi et al.
- A Survey Of Large Language Models Wayne Xin Zhao et al.
- EVA-02: A Visual Representation For Neon Genesis Yuxin Fang et al.
- GPT-4 Enhanced Multimodal Grounding For Autonomous Driving: Leveraging Cross-modal Attention With Large Language Models Haicheng Liao et al.
- Chatgpt For Shaping The Future Of Dentistry: The Potential Of Multi-modal Large Language Model Hanyao Huang et al.
- Explainability For Large Language Models: A Survey Haiyan Zhao et al.
- Text Matching Improves Sequential Recommendation By Reducing Popularity Biases Zhenghao Liu et al.
- Moviechat: From Dense Token To Sparse Memory For Long Video Understanding Enxin Song et al.
- Sparsegpt: Massive Language Models Can Be Accurately Pruned In One-shot Elias Frantar, Dan Alistarh
- Read-only Prompt Optimization For Vision-language Few-shot Learning Dongjun Lee et al.
- MELTR: Meta Loss Transformer For Learning To Fine-tune Video Foundation Models Dohwan Ko et al.
- 3d-vista: Pre-trained Transformer For 3D Vision And Text Alignment Ziyu Zhu et al.
- Visual Chatgpt: Talking, Drawing And Editing With Visual Foundation Models Chenfei Wu et al.
- Chatgpt And A New Academic Reality: Artificial Intelligence-written Research Papers And The Ethics Of The Large Language Models In Scholarly Publishing Brady Lund et al.
- RWKV: Reinventing Rnns For The Transformer Era Bo Peng et al.
- Scaling Transformer To 1M Tokens And Beyond With RMT Aydar Bulatov, Yuri Kuratov, Yermek Kapushev, Mikhail S. Burtsev
- Chatgpt: Applications, Opportunities, And Threats Aram Bahrini et al.
- How Good Are GPT Models At Machine Translation? A Comprehensive Evaluation Amr Hendy et al.
- The Impact Of Positional Encoding On Length Generalization In Transformers Amirhossein Kazemnejad, Inkit Padhi, Karthikeyan Natesan Ramamurthy, Payel Das, Siva Reddy
- Mamba: Linear-time Sequence Modeling With Selective State Spaces Albert Gu, Tri Dao
- A Comparative Study Of Pretrained Language Models For Long Clinical Text Yikuan Li, Ramsey M. Wehbe, Faraz S. Ahmad, Hanyin Wang, Yuan Luo
- Biomedgpt: Open Multimodal Generative Pre-trained Transformer For Biomedicine Yizhen Luo et al.
- Pali-3 Vision Language Models: Smaller, Faster, Stronger Xi Chen et al.
- The Unreasonable Effectiveness Of Few-shot Learning For Machine Translation Xavier Garcia et al.
- Instructblip: Towards General-purpose Vision-language Models With Instruction Tuning Wenliang Dai et al.
- Retentive Network: A Successor To Transformer For Large Language Models Yutao Sun et al.
- Textbooks Are All You Need II: Phi-1.5 Technical Report Yuanzhi Li et al.
- Transformers Are Ssms: Generalized Models And Efficient Algorithms Through Structured State Space Duality Tri Dao, Albert Gu
- The Era Of 1-bit Llms: All Large Language Models Are In 1.58 Bits Shuming Ma et al.
- Hidden Flaws Behind Expert-level Accuracy Of Multimodal GPT-4 Vision In Medicine Qiao Jin et al.
- From Text To Transformation: A Comprehensive Review Of Large Language Models' Versatility Pravneet Kaur et al.
- Jamba: A Hybrid Transformer-mamba Language Model Opher Lieber et al.
- A Survey Of Resource-efficient LLM And Multimodal Foundation Models Mengwei Xu et al.
- History Of Generative Artificial Intelligence (AI) Chatbots: Past, Present, And Future Development Md. Al-amin et al.
- Exploring Chatgpt And Its Impact On Society Md. Asraful Haque, Shuai Li
- Xlstm: Extended Long Short-term Memory Maximilian Beck et al.
- Language Models For Code Completion: A Practical Evaluation Maliheh Izadi et al.
- Linrec: Linear Attention Mechanism For Long-term Sequential Recommender Systems Langming Liu et al.
- Pixart-Σ: Weak-to-strong Training Of Diffusion Transformer For 4K Text-to-image Generation Junsong Chen et al.
- Revolutionizing Finance With Llms: An Overview Of Applications And Insights Huaqin Zhao et al.
- Gemma 2: Improving Open Language Models At A Practical Size Gemma Team et al.
- AI And Memory Wall Amir Gholami et al.
- Yi: Open Foundation Models By 01.AI 01.AI et al.
- Large Language Model (LLM) AI Text Generation Detection Based On Transformer Deep Learning Algorithm Yuhong Mo, Hao Qin, Yushan Dong, Ziyi Zhu, Zhenglin Li
🏷 Uncategorized
- Conversational Contextual Cues: The Case Of Personalization And History For Response Ranking Rami Al-rfou et al.
- Why Are Sequence-to-sequence Models So Dull? Understanding The Low-diversity Problem Of Chatbots Shaojie Jiang, Maarten De Rijke
- Flowqa: Grasping Flow In History For Conversational Machine Comprehension Hsin-yuan Huang, Eunsol Choi, Wen-tau Yih
- Lingke: A Fine-grained Multi-turn Chatbot For Customer Service Pengfei Zhu, Zhuosheng Zhang, Jiangtong Li, Yafang Huang, Hai Zhao
- Response Generation By Context-aware Prototype Editing Yu Wu et al.
- Negated And Misprimed Probes For Pretrained Language Models: Birds Can Talk, But Cannot Fly Nora Kassner, Hinrich Schütze
- Pre-trained Language Model Representations For Language Generation Sergey Edunov, Alexei Baevski, Michael Auli
- Episodic Memory In Lifelong Language Learning Cyprien De Masson D'autume, Sebastian Ruder, Lingpeng Kong, Dani Yogatama
- Lawformer: A Pre-trained Language Model For Chinese Legal Long Documents Chaojun Xiao, Xueyu Hu, Zhiyuan Liu, Cunchao Tu, Maosong Sun
- Human Heuristics For Ai-generated Language Are Flawed Maurice Jakesch, Jeffrey Hancock, Mor Naaman
- Turning Large Language Models Into Cognitive Models Marcel Binz, Eric Schulz
- Active Retrieval Augmented Generation Zhengbao Jiang et al.
- Understanding And Detecting Hallucinations In Neural Machine Translation Via Model Introspection Weijia Xu, Sweta Agrawal, Eleftheria Briakou, Marianna J. Martindale, Marine Carpuat
- Getting From Generative AI To Trustworthy AI: What Llms Might Learn From Cyc Doug Lenat, Gary Marcus
- Mol-instructions: A Large-scale Biomolecular Instruction Dataset For Large Language Models Yin Fang et al.
- Can Large Language Models Reason And Plan? Subbarao Kambhampati
🏷 Vector Indexing
🏷 WMT
- Attention Strategies For Multi-source Sequence-to-sequence Learning Jindřich Libovický, Jindřich Helcl
- Attention Is All You Need Ashish Vaswani et al.
- Weighted Transformer Network For Machine Translation Karim Ahmed, Nitish Shirish Keskar, Richard Socher
- The Memad Submission To The WMT18 Multimodal Translation Task Stig-arne Grönroos et al.
- "Bilingual Expert" Can Find Translation Errors Kai Fan et al.
- Cross-lingual Language Model Pretraining Guillaume Lample, Alexis Conneau
- Acquiring Knowledge From Pre-trained Model To Neural Machine Translation Rongxiang Weng, Heng Yu, Shujian Huang, Shanbo Cheng, Weihua Luo
- Transformers Without Tears: Improving The Normalization Of Self-attention Toan Q. Nguyen, Julian Salazar
- A Tensorized Transformer For Language Modeling Xindian Ma et al.
- Pay Less Attention With Lightweight And Dynamic Convolutions Felix Wu, Angela Fan, Alexei Baevski, Yann N. Dauphin, Michael Auli
- Modeling Recurrence For Transformer Jie Hao et al.
- A Generalized Framework Of Sequence Generation With Application To Undirected Sequence Models Elman Mansimov, Alex Wang, Sean Welleck, Kyunghyun Cho
- Analyzing Multi-head Self-attention: Specialized Heads Do The Heavy Lifting, The Rest Can Be Pruned Elena Voita, David Talbot, Fedor Moiseev, Rico Sennrich, Ivan Titov
- The Evolved Transformer David R. So, Chen Liang, Quoc V. Le
- On The Use Of BERT For Neural Machine Translation Stéphane Clinchant, Kweon Woo Jung, Vassilina Nikoulina
- Insertion Transformer: Flexible Sequence Generation Via Insertion Operations Mitchell Stern, William Chan, Jamie Kiros, Jakob Uszkoreit
- Microsoft Translator At WMT 2019: Towards Large-scale Document-level Neural Machine Translation Marcin Junczys-dowmunt
- Syntax-infused Transformer And BERT Models For Machine Translation And Natural Language Understanding Dhanasekar Sundararaman et al.
- Towards Making The Most Of BERT In Neural Machine Translation Jiacheng Yang et al.
- Lite Transformer With Long-short Range Attention Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han
- Incorporating BERT Into Parallel Sequence Decoding With Adapters Junliang Guo et al.
- BLEURT: Learning Robust Metrics For Text Generation Thibault Sellam, Dipanjan Das, Ankur P. Parikh
- XLM-T: Scaling Up Multilingual Machine Translation With Pretrained Cross-lingual Transformer Encoders Shuming Ma et al.
- Very Deep Transformers For Neural Machine Translation Xiaodong Liu, Kevin Duh, Liyuan Liu, Jianfeng Gao
- HAT: Hardware-aware Transformers For Efficient Natural Language Processing Hanrui Wang et al.
- Contrastive Learning For Many-to-many Multilingual Neural Machine Translation Xiao Pan, Mingxuan Wang, Liwei Wu, Lei Li
- BERT, Mbert, Or Bibert? A Study On Contextualized Embeddings For Neural Machine Translation Haoran Xu, Benjamin Van Durme, Kenton Murray
- Hierarchical Learning For Generation With Long Source Sequences Tobias Rohde, Xiaoxia Wu, Yinhan Liu
- Mind The Gap: Assessing Temporal Generalization In Neural Language Models Angeliki Lazaridou et al.
- TURINGBENCH: A Benchmark Environment For Turing Test In The Age Of Neural Text Generation Adaku Uchendu, Zeyu Ma, Thai Le, Rui Zhang, Dongwon Lee
- Gtrans: Grouping And Fusing Transformer Layers For Neural Machine Translation Jian Yang et al.
- Large Language Models Are State-of-the-art Evaluators Of Translation Quality Tom Kocmi, Christian Federmann
- The Unreasonable Effectiveness Of Few-shot Learning For Machine Translation Xavier Garcia et al.