FacetedDBLP

Searching for "tokenization" with no syntactic query expansion in all metadata.

Publication years (Num. hits)
1992-2002 (17) 2003-2004 (17) 2005-2007 (25) 2008-2009 (18) 2010-2013 (22) 2014-2016 (24) 2017-2018 (20) 2019-2020 (42) 2021 (39) 2022 (43) 2023 (93) 2024 (25)
Publication types (Num. hits)
article(167) inproceedings(217) phdthesis(1)
GrowBag graphs for keyword (Num. hits/coverage)
The graphs summarize 105 occurrences of 73 keywords.

Results
Found 385 publication records. Showing 385 according to the selection in the facets.
Hits | Authors | Title | Venue | Year | Author keywords
143 | Jing Jiang 0007, ChengXiang Zhai | An empirical study of tokenization strategies for biomedical information retrieval. | Inf. Retr. | 2007 | Stop word, Tokenization, Stemming, Biomedical information retrieval
109 | Dolf Trieschnigg, Wessel Kraaij, Franciska de Jong | The influence of basic tokenization on biomedical document retrieval. | SIGIR | 2007 | biomedical document retrieval, tokenization, lexical analysis
75 | Rong Tong, Bin Ma 0001, Kong-Aik Lee, Changhuai You, Donglai Zhu, Tomi Kinnunen, Hanwu Sun, Minghui Dong, Chng Eng Siong, Haizhou Li 0001 | Fusion of Acoustic and Tokenization Features for Speaker Recognition. | ISCSLP | 2006 | cepstral feature, phonotactic feature, support vector machine, Gaussian mixture model, fusion, tokenization, speaker recognition
75 | Thomas W. Reps | "Maximal-munch" Tokenization in Linear Time. | ACM Trans. Program. Lang. Syst. | 1998 | tabulation, dynamic programming, backtracking, tokenization, memoization
73 | Eiman Tamah Al-Shammari, Jessica Lin 0001 | A novel Arabic lemmatization algorithm. | AND | 2008 | text mining, tokenization, stemming, Arabic, lemmatization
56 | Paul McNamee, Charles K. Nicholas, James Mayfield | Addressing morphological variation in alphabetic languages. | SIGIR | 2009 | morphology, tokenization, stemming, CLIR, character n-grams
52 | Paul McNamee, James Mayfield | JHU/APL Experiments in Tokenization and Non-word Translation. | CLEF | 2003 |
51 | Daniele Paolo Scarpazza, Gregory F. Russell | High-performance regular expression scanning on the Cell/B.E. processor. | ICS | 2009 | multi-core, regular expressions, cell processor
51 | Georgiana Ifrim, Gökhan H. Bakir, Gerhard Weikum | Fast logistic regression for text categorization with variable-length n-grams. | KDD | 2008 | fast logistic regression, n-gram learning, variable-length n-grams, text categorization
41 | Doan Nguyen | Query preprocessing: improving web search through a Vietnamese word tokenization approach. | SIGIR | 2008 | Vietnamese word tokenization, query preprocessing, search
39 | Daniel P. Lopresti | Optical character recognition errors and their effects on natural language processing. | AND | 2008 | performance evaluation, optical character recognition, tokenization, part-of-speech tagging, sentence boundary detection
39 | Benoît Sagot | Building a Morphosyntactic Lexicon and a Pre-syntactic Processing Chain for Polish. | LTC | 2007 | Morphosyntactic lexicon, pre-syntactic processing, named entites recognition, Polish language, tokenization, spelling correction
39 | Paul McNamee | JHU/APL Ad Hoc Experiments at CLEF 2006. | CLEF | 2006 | character n-gram tokenization, corpus-based translation, Cross-language information retrieval
39 | John F. Pitrelli, Amit Roy | Creating word-level language models for large-vocabulary handwriting recognition. | Int. J. Document Anal. Recognit. | 2003 | Unigram, Search, Language modeling, Handwriting recognition, Tokenization
39 | James Mayfield, Paul McNamee | Single n-gram stemming. | SIGIR | 2003 | n-gram tokenization, stemming
37 | Aaditya K. Singh, DJ Strouse | Tokenization counts: the impact of tokenization on arithmetic in frontier LLMs. | CoRR | 2024 |
37 | Anaelia Ovalle, Ninareh Mehrabi, Palash Goyal, Jwala Dhamala, Kai-Wei Chang, Richard S. Zemel, Aram Galstyan, Rahul Gupta 0001 | Are you talking to ['xem'] or ['x', 'em']? On Tokenization and Addressing Misgendering in LLMs with Pronoun Tokenization Parity. | CoRR | 2023 |
37 | Murhaf Fares, Stephan Oepen, Yi Zhang 0003 | Machine Learning for High-Quality Tokenization Replicating Variable Tokenization Schemes. | CICLing (1) | 2013 |
35 | Hamid Abdul Basit, Simon J. Puglisi, William F. Smyth, Andrew Turpin, Stan Jarzabek | Efficient token based clone detection with flexible tokenization. | ESEC/SIGSOFT FSE | 2007 | token-based clone detection, reverse engineering, software maintenance, clone detection
35 | Jaap Kamps, Sisay Fissaha Adafre, Maarten de Rijke | Effective Translation, Tokenization and Combination for Cross-Lingual Retrieval. | CLEF | 2004 |
35 | António Horta Branco, João Ricardo Silva | Contractions: Breaking the Tokenization-Tagging Circularity. | PROPOR | 2003 |
35 | Jorge Graña, Francisco-Mario Barcala, Jesús Vilares Ferro | Formal Methods of Tokenization for Part-of-Speech Tagging. | CICLing | 2002 |
34 | John F. Pitrelli, Amit Roy | Creating Word-Level Language Models for Handwriting Recognition. | ICDAR | 2001 |
22 | Eiman Tamah Al-Shammari, Jessica Lin 0001 | Towards an error-free Arabic stemming. | CIKM-iNEWS | 2008 | text mining, tokenization, stemming, Arabic, lemmatization
22 | Choochart Haruechaiyasak, Sarawoot Kongyoung, Chaianun Damrongrat | LearnLexTo: a machine-learning based word segmentation for indexing Thai texts. | CIKM-iNEWS | 2008 | indexing, tokenization, word segmentation
22 | George Forman, Evan Kirshenbaum | Extremely fast text feature extraction for classification and indexing. | CIKM | 2008 | feature engineering, text tokenization, feature extraction, text mining, text indexing, document categorization, bag-of-words
22 | Paul McNamee, Charles K. Nicholas, James Mayfield | Don't have a stemmer?: be un+concern+ed. | SIGIR | 2008 | unsupervised morphological segmentation, tokenization, stemming, character n-grams
22 | Dan Tufis, Dan Stefanescu, Radu Ion, Alexandru Ceausu | RACAI's Question Answering System at QA@CLEF2007. | CLEF | 2007 | tagging, question answering, tokenization, query generation, lemmatization, answer extraction, Lucene, indexing and retrieval
22 | Daniel P. Lopresti | Performance evaluation for text processing of noisy inputs. | SAC | 2005 | performance evaluation, optical character recognition, tokenization, part-of-speech tagging, sentence boundary detection
22 | Paul McNamee, James Mayfield | Translating pieces of words. | SIGIR | 2005 | character n-gram tokenization, translation, cross-language information retrieval, parallel corpora
22 | Robert C. Goldstein, Christian Wagner 0001 | Database Management with Sequence Trees and Tokens. | IEEE Trans. Knowl. Data Eng. | 1997 | performance, design, Abstract data types, database management, tokenization, file organization
18 | Taewan Kim, Jinwoo Kim, Heeseok Oh, Jiwoo Kang | Deep Transformer Based Video Inpainting Using Fast Fourier Tokenization. | IEEE Access | 2024 |
18 | Yigit Bekir Kaya, A. Cüneyd Tantug | Effect of tokenization granularity for Turkish large language models. | Intell. Syst. Appl. | 2024 |
18 | Pierre Poitier, Jérôme Fink, Benoît Frénay | Towards better transition modeling in recurrent neural networks: The case of sign language tokenization. | Neurocomputing | 2024 |
18 | Edo Dotan, Gal Jaschek, Tal Pupko, Yonatan Belinkov | Effect of tokenization on transformers for biological sequences. | Bioinform. | 2024 |
18 | Jinbiao Yang | Rethinking Tokenization: Crafting Better Tokenizers for Large Language Models. | CoRR | 2024 |
18 | Ninglu Shao, Shitao Xiao, Zheng Liu, Peitian Zhang | Flexibly Scaling Large Language Models Contexts Through Extensible Tokenization. | CoRR | 2024 |
18 | Qijiong Liu, Hengchang Hu, Jiahao Wu 0004, Jieming Zhu, Min-Yen Kan, Xiao-Ming Wu 0003 | Discrete Semantic Tokenization for Deep CTR Prediction. | CoRR | 2024 |
18 | Omer Goldman, Avi Caciularu, Matan Eyal, Kris Cao, Idan Szpektor, Reut Tsarfaty | Unpacking Tokenization: Evaluating Text Compression and its Correlation with Model Performance. | CoRR | 2024 |
18 | Mahdi Biparva, Raika Karimi, Faezeh Faez, Yingxue Zhang 0001 | Todyformer: Towards Holistic Dynamic Graph Transformers with Structure-Aware Tokenization. | CoRR | 2024 |
18 | Bo Yang, Chen Wang, Xiaoshuang Ma, Beiping Song, Zhuang Liu | Zero-shot sketch-based remote sensing image retrieval based on multi-level and attention-guided tokenization. | CoRR | 2024 |
18 | Yang Jin, Zhicheng Sun 0001, Kun Xu, Kun Xu 0005, Liwei Chen, Hao Jiang, Quzhe Huang, Chengru Song, Yuliang Liu, Di Zhang, Yang Song, Kun Gai, Yadong Mu | Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization. | CoRR | 2024 |
18 | Mohamed Taher Alrefaie, Nour Eldin Morsy, Nada Samir | Exploring Tokenization Strategies and Vocabulary Sizes for Enhanced Arabic Language Models. | CoRR | 2024 |
18 | Craig W. Schmidt, Varshini Reddy, Haoran Zhang, Alec Alameddine, Omri Uzan, Yuval Pinter, Chris Tanner | Tokenization Is More Than Compression. | CoRR | 2024 |
18 | Ritwik Gupta, Shufan Li, Tyler Zhu, Jitendra Malik, Trevor Darrell, Karttikeya Mangalam | xT: Nested Tokenization for Larger Context in Large Images. | CoRR | 2024 |
18 | Marco Cognetta, Vilém Zouhar, Sangwhan Moon, Naoaki Okazaki | Two Counterexamples to Tokenization and the Noiseless Channel. | CoRR | 2024 |
18 | Delong Chen, Samuel Cahyawijaya, Jianfeng Liu, Baoyuan Wang, Pascale Fung | Subobject-level Image Tokenization. | CoRR | 2024 |
18 | Leonidas Gee, Leonardo Rigutini, Marco Ernandes, Andrea Zugarini | Multi-Word Tokenization for Sequence Compression. | CoRR | 2024 |
18 | Yingbo Ma, Suraj Kolla, Dhruv Kaliraman, Victoria Nolan, Zhenhong Hu, Ziyuan Guan, Yuanfang Ren, Brooke Armfield, Tezcan Ozrazgat-Baslanti, Tyler J. Loftus, Parisa Rashidi, Azra Bihorac, Benjamin Shickel | Temporal Cross-Attention for Dynamic Embedding and Tokenization of Multimodal Electronic Health Records. | CoRR | 2024 |
18 | Yanis Labrak, Adrien Bazoge, Béatrice Daille, Mickael Rouvier, Richard Dufour | How Important Is Tokenization in French Medical Masked Language Models? | CoRR | 2024 |
18 | Geeling Chau, Yujin An, Ahamed Raffey Iqbal, Soon-Jo Chung, Yisong Yue, Sabera Talukder | Generalizability Under Sensor Failure: Tokenization + Transformers Enable More Robust Latent Spaces. | CoRR | 2024 |
18 | Catherine Arnett, Pamela D. Rivière, Tyler A. Chang, Sean Trott | Different Tokenization Schemes Lead to Comparable Performance in Spanish Number Agreement. | CoRR | 2024 |
18 | Serghiy Obushnyi, Denys Virovets, Maksym Zhytar, Yuliia Zhdanova | Unlocking the Economic Potential of Real Estate Tokenization in Ukraine. | DECaT | 2024 |
18 | Rob Goot | Where are we Still Split on Tokenization? | EACL (Findings) | 2024 |
18 | Venkata Krishna Paanchajanya Kuppa, Rajendra Hegadi, Karthik Sajjan, Karusala Deepak Chowdary, Manoj Sahit Reddy Vanga, Poonam Kumar | Investopolis: Decentralized Customer Loyalty Tokenization on the Blockchain. | ICCE | 2024 |
18 | Cagri Toraman, Eyup Halit Yilmaz, Furkan Sahinuç, Oguzhan Ozcelik | Impact of Tokenization on Language Models: An Analysis for Turkish. | ACM Trans. Asian Low Resour. Lang. Inf. Process. | 2023 |
18 | Asma Mekki, Inès Zribi, Mariem Ellouze, Lamia Hadrich Belguith | Tokenization of Tunisian Arabic: A Comparison between Three Machine Learning Models. | ACM Trans. Asian Low Resour. Lang. Inf. Process. | 2023 |
18 | Zhenyu Xu, Huiqiang Hu, Tingting Wang, Yuping Zhao, Cong Zhou, Huaxing Xu, Xiaobo Mao | Identification of growth years of Kudzu root by hyperspectral imaging combined with spectral-spatial feature tokenization transformer. | Comput. Electron. Agric. | 2023 |
18 | Minseo Kang, Byunghan Lee | TiCTok: Time-Series Anomaly Detection With Contrastive Tokenization. | IEEE Access | 2023 |
18 | Ali Erkan, Tunga Güngör | Analysis of Deep Learning Model Combinations and Tokenization Approaches in Sentiment Classification. | IEEE Access | 2023 |
18 | Yifei Xu, Yixuan Xie, Bicheng Li, Chuanqi Xie, Yongchuan Zhang, Aichen Wang, Li Zhu | Spatial-Spectral 1DSwin Transformer With Groupwise Feature Tokenization for Hyperspectral Image Classification. | IEEE Trans. Geosci. Remote. Sens. | 2023 |
18 | Lina Gao, Bing Liu 0022, Ping Fu, Mingzhu Xu | Adaptive Spatial Tokenization Transformer for Salient Object Detection in Optical Remote Sensing Images. | IEEE Trans. Geosci. Remote. Sens. | 2023 |
18 | Arjun Rachana Harish, Xinlai Liu, Ming Li 0055, Ray Y. Zhong, George Q. Huang | Blockchain-enabled digital assets tokenization for cyber-physical traceability in E-commerce logistics financing. | Comput. Ind. | 2023 |
18 | Zhihong Lei, Ernest Pusateri, Shiyi Han, Leo Liu, Mingbin Xu, Tim Ng, Ruchir Travadi, Youyuan Zhang, Mirko Hannemann, Man-Hung Siu, Zhen Huang 0001 | Personalization of CTC-based End-to-End Speech Recognition Using Pronunciation-Driven Subword Tokenization. | CoRR | 2023 |
18 | Tomer Wullach, Shlomo E. Chazan | Optimized Tokenization for Transcribed Error Correction. | CoRR | 2023 |
18 | Vardaan Pahuja, A. J. Piergiovanni, Anelia Angelova | Diversifying Joint Vision-Language Tokenization Learning. | CoRR | 2023 |
18 | Lisa Beinborn, Yuval Pinter | Analyzing Cognitive Plausibility of Subword Tokenization. | CoRR | 2023 |
18 | Tatsuya Hiraoka, Tomoya Iwakura | Tokenization Tractability for Human and Machine Learning Model: An Annotation Study. | CoRR | 2023 |
18 | Nathan Fradet, Jean-Pierre Briot, Fabien Chhel, Amal El Fallah Seghrouchni, Nicolas Gutowski | miditok: A Python package for MIDI file tokenization. | CoRR | 2023 |
18 | Tomasz Limisiewicz, Jirí Balhar, David Marecek | Tokenization Impacts Multilingual Language Modeling: Assessing Vocabulary Allocation and Overlap Across Languages. | CoRR | 2023 |
18 | Jakob Drachmann Havtorn, Amelie Royer, Tijmen Blankevoort, Babak Ehteshami Bejnordi | MSViT: Dynamic Mixed-Scale Tokenization for Vision Transformers. | CoRR | 2023 |
18 | Avijit Thawani, Saurabh Ghanekar, Xiaoyuan Zhu, Jay Pujara | Learn Your Tokens: Word-Pooled Tokenization for Language Modeling. | CoRR | 2023 |
18 | Martin Berglund, Brink van der Merwe | Formalizing BPE Tokenization. | NCMA | 2023 |
18 | Tae-Hee Jeon, Bongseok Yang, Changhwan Kim, Yoonseob Lim | Improving Korean NLP Tasks with Linguistically Informed Subword Tokenization and Sub-character Decomposition. | CoRR | 2023 |
18 | Siyang Liu 0003, Naihao Deng, Sahand Sabour, Yilin Jia, Minlie Huang, Rada Mihalcea | Enhancing Long-form Text Generation Efficacy with Task-adaptive Tokenization. | CoRR | 2023 |
18 | Xinfa Zhu, Yuanjun Lv, Yi Lei, Tao Li, Wendi He, Hongbin Zhou, Heng Lu 0004, Lei Xie 0001 | Vec-Tok Speech: speech vectorization and tokenization for neural speech generation. | CoRR | 2023 |
18 | Sid Wang, Akshat Shrivastava, Sasha Livshits | TreePiece: Faster Semantic Parsing via Tree Tokenization. | CoRR | 2023 |
18 | Yang Jin, Kun Xu, Kun Xu 0005, Liwei Chen, Chao Liao, Jianchao Tan, Quzhe Huang, Bin Chen, Chenyi Lei, An Liu, Chengru Song, Xiaoqiang Lei, Di Zhang, Wenwu Ou, Kun Gai, Yadong Mu | Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization. | CoRR | 2023 |
18 | David Yunis, Justin Jung, Falcon Z. Dai, Matthew R. Walter | Subwords as Skills: Tokenization for Sparse-Reward Reinforcement Learning. | CoRR | 2023 |
18 | Adarsh Kumar, Pedro Sarmento | From Words to Music: A Study of Subword Tokenization Techniques in Symbolic Music Generation. | CoRR | 2023 |
18 | Anton Kolonin | Self-tuning hyper-parameters for unsupervised cross-lingual tokenization. | CoRR | 2023 |
18 | Yang Tan, Mingchen Li, Pan Tan, Ziyi Zhou, Huiqun Yu, Guisheng Fan, Liang Hong | PETA: Evaluating the Impact of Protein Transfer Learning with Sub-word Tokenization on Downstream Applications. | CoRR | 2023 |
18 | Harsh Chaudhari, Anuja Patil, Dhanashree Lavekar, Pranav Khairnar, Raviraj Joshi, Sachin Pande | On Significance of Subword tokenization for Low Resource and Efficient Named Entity Recognition: A case study in Marathi. | CoRR | 2023 |
18 | Vilém Zouhar, Clara Meister, Juan Luis Gastaldi, Li Du, Mrinmaya Sachan, Ryan Cotterell | Tokenization and the Noiseless Channel. | CoRR | 2023 |
18 | Bar Iluz, Tomasz Limisiewicz, Gabriel Stanovsky, David Marecek | Exploring the Impact of Training Data Distribution and Subword Tokenization on Gender Bias in Machine Translation. | CoRR | 2023 |
18 | Weichao Zhao, Hezhen Hu, Wengang Zhou, Jiaxin Shi, Houqiang Li | BEST: BERT Pre-Training for Sign Language Recognition with Coupling Tokenization. | CoRR | 2023 |
18 | Benoist Wolleb, Romain Silvestri, Giorgos Vernikos, Ljiljana Dolamic, Andrei Popescu-Belis | Assessing the Importance of Frequency versus Compositionality for Subword-based Tokenization in NMT. | CoRR | 2023 |
18 | Bernal Jiménez Gutiérrez, Huan Sun 0001, Yu Su 0001 | Biomedical Language Models are Robust to Sub-optimal Tokenization. | CoRR | 2023 |
18 | Christos Theodoropoulos 0001, Marie-Francine Moens | An Information Extraction Study: Take In Mind the Tokenization! | CoRR | 2023 |
18 | Zhichao Huang, Chutong Meng, Tom Ko | RepCodec: A Speech Representation Codec for Speech Tokenization. | CoRR | 2023 |
18 | Orevaoghene Ahia, Sachin Kumar 0009, Hila Gonen, Jungo Kasai, David R. Mortensen, Noah A. Smith, Yulia Tsvetkov | Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models. | CoRR | 2023 |
18 | Maksym Lysak, Ahmed S. Nassar, Nikolaos Livathinos, Christoph Auer, Peter W. J. Staar | Optimized Table Tokenization for Table Structure Recognition. | CoRR | 2023 |
18 | Haozhe An, Rachel Rudinger | Nichelle and Nancy: The Influence of Demographic Attributes and Tokenization Length on First Name Biases. | CoRR | 2023 |
18 | Tomer Ronen, Omer Levy, Avram Golbert | Vision Transformers with Mixed-Resolution Tokenization. | CoRR | 2023 |
18 | Xuguang Ai, Ramakanth Kavuluru | End-to-End Models for Chemical-Protein Interaction Extraction: Better Tokenization and Span-Based Pipeline Strategies. | CoRR | 2023 |
18 | Verena Blaschke, Hinrich Schütze, Barbara Plank | Does Manipulating Tokenization Aid Cross-Lingual Transfer? A Study on POS Tagging for Non-Standardized Languages. | CoRR | 2023 |
18 | Mikhail Tikhomirov, Daniil Chernyshev | Impact of Tokenization on LLaMa Russian Adaptation. | CoRR | 2023 |
18 | David Samuel, Lilja Øvrelid | Tokenization with Factorized Subword Encoding. | CoRR | 2023 |
18 | Luca Gagliardelli, Luca Zecchini, Luca Ferretti, Domenico Beneventano, Giovanni Simonini, Sonia Bergamaschi, Mirko Orsini, Luca Magnotta, Emma Mescoli, Andrea Livaldi, Nicola Gessa, Piero De Sabbata, Gianluca D'Agosta, Fabrizio Paolucci, Fabio Moretti | A big data platform exploiting auditable tokenization to promote good practices inside local energy communities. | Future Gener. Comput. Syst. | 2023 |
18 | Yan Zhuang, Chi-Ren Shyu, Shenda Hong, Pengfei Li, Luxia Zhang | Self-sovereign identity empowered non-fungible patient tokenization for health information exchange using blockchain technology. | Comput. Biol. Medicine | 2023 |
Displaying results #1 - #100 of 385 (100 per page).
Maintained by L3S.
Previously maintained by Jörg Diederich.
Based upon DBLP by Michael Ley.
Open data: data released under the ODC-BY 1.0 license.