Results
Found 385 publication records. Showing 385 according to the selection in the facets.
Hits |
Authors |
Title |
Venue |
Year |
Link |
Author keywords |
143 | Jing Jiang 0007, ChengXiang Zhai |
An empirical study of tokenization strategies for biomedical information retrieval. |
Inf. Retr. |
2007 |
DBLP DOI BibTeX RDF |
Stop word, Tokenization, Stemming, Biomedical information retrieval |
109 | Dolf Trieschnigg, Wessel Kraaij, Franciska de Jong |
The influence of basic tokenization on biomedical document retrieval. |
SIGIR |
2007 |
DBLP DOI BibTeX RDF |
biomedical document retrieval, tokenization, lexical analysis |
75 | Rong Tong, Bin Ma 0001, Kong-Aik Lee, Changhuai You, Donglai Zhu, Tomi Kinnunen, Hanwu Sun, Minghui Dong, Chng Eng Siong, Haizhou Li 0001 |
Fusion of Acoustic and Tokenization Features for Speaker Recognition. |
ISCSLP |
2006 |
DBLP DOI BibTeX RDF |
cepstral feature, phonotactic feature, support vector machine, Gaussian mixture model, fusion, tokenization, speaker recognition |
75 | Thomas W. Reps |
"Maximal-munch" Tokenization in Linear Time. |
ACM Trans. Program. Lang. Syst. |
1998 |
DBLP DOI BibTeX RDF |
tabulation, dynamic programming, backtracking, tokenization, memoization |
73 | Eiman Tamah Al-Shammari, Jessica Lin 0001 |
A novel Arabic lemmatization algorithm. |
AND |
2008 |
DBLP DOI BibTeX RDF |
text mining, tokenization, stemming, Arabic, lemmatization |
56 | Paul McNamee, Charles K. Nicholas, James Mayfield |
Addressing morphological variation in alphabetic languages. |
SIGIR |
2009 |
DBLP DOI BibTeX RDF |
morphology, tokenization, stemming, CLIR, character n-grams |
52 | Paul McNamee, James Mayfield |
JHU/APL Experiments in Tokenization and Non-word Translation. |
CLEF |
2003 |
DBLP DOI BibTeX RDF |
|
51 | Daniele Paolo Scarpazza, Gregory F. Russell |
High-performance regular expression scanning on the Cell/B.E. processor. |
ICS |
2009 |
DBLP DOI BibTeX RDF |
multi-core, regular expressions, cell processor |
51 | Georgiana Ifrim, Gökhan H. Bakir, Gerhard Weikum |
Fast logistic regression for text categorization with variable-length n-grams. |
KDD |
2008 |
DBLP DOI BibTeX RDF |
fast logistic regression, n-gram learning, variable-length n-grams, text categorization |
41 | Doan Nguyen |
Query preprocessing: improving web search through a Vietnamese word tokenization approach. |
SIGIR |
2008 |
DBLP DOI BibTeX RDF |
Vietnamese word tokenization, query preprocessing, search |
39 | Daniel P. Lopresti |
Optical character recognition errors and their effects on natural language processing. |
AND |
2008 |
DBLP DOI BibTeX RDF |
performance evaluation, optical character recognition, tokenization, part-of-speech tagging, sentence boundary detection |
39 | Benoît Sagot |
Building a Morphosyntactic Lexicon and a Pre-syntactic Processing Chain for Polish. |
LTC |
2007 |
DBLP DOI BibTeX RDF |
Morphosyntactic lexicon, pre-syntactic processing, named entities recognition, Polish language, tokenization, spelling correction |
39 | Paul McNamee |
JHU/APL Ad Hoc Experiments at CLEF 2006. |
CLEF |
2006 |
DBLP DOI BibTeX RDF |
character n-gram tokenization, corpus-based translation, Cross-language information retrieval |
39 | John F. Pitrelli, Amit Roy |
Creating word-level language models for large-vocabulary handwriting recognition. |
Int. J. Document Anal. Recognit. |
2003 |
DBLP DOI BibTeX RDF |
Unigram, Search, Language modeling, Handwriting recognition, Tokenization |
39 | James Mayfield, Paul McNamee |
Single n-gram stemming. |
SIGIR |
2003 |
DBLP DOI BibTeX RDF |
n-gram tokenization, stemming |
37 | Aaditya K. Singh, DJ Strouse |
Tokenization counts: the impact of tokenization on arithmetic in frontier LLMs. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
37 | Anaelia Ovalle, Ninareh Mehrabi, Palash Goyal, Jwala Dhamala, Kai-Wei Chang, Richard S. Zemel, Aram Galstyan, Rahul Gupta 0001 |
Are you talking to ['xem'] or ['x', 'em']? On Tokenization and Addressing Misgendering in LLMs with Pronoun Tokenization Parity. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
37 | Murhaf Fares, Stephan Oepen, Yi Zhang 0003 |
Machine Learning for High-Quality Tokenization Replicating Variable Tokenization Schemes. |
CICLing (1) |
2013 |
DBLP DOI BibTeX RDF |
|
35 | Hamid Abdul Basit, Simon J. Puglisi, William F. Smyth, Andrew Turpin, Stan Jarzabek |
Efficient token based clone detection with flexible tokenization. |
ESEC/SIGSOFT FSE |
2007 |
DBLP DOI BibTeX RDF |
token-based clone detection, reverse engineering, software maintenance, clone detection |
35 | Jaap Kamps, Sisay Fissaha Adafre, Maarten de Rijke |
Effective Translation, Tokenization and Combination for Cross-Lingual Retrieval. |
CLEF |
2004 |
DBLP DOI BibTeX RDF |
|
35 | António Horta Branco, João Ricardo Silva |
Contractions: Breaking the Tokenization-Tagging Circularity. |
PROPOR |
2003 |
DBLP DOI BibTeX RDF |
|
35 | Jorge Graña, Francisco-Mario Barcala, Jesús Vilares Ferro |
Formal Methods of Tokenization for Part-of-Speech Tagging. |
CICLing |
2002 |
DBLP DOI BibTeX RDF |
|
34 | John F. Pitrelli, Amit Roy |
Creating Word-Level Language Models for Handwriting Recognition. |
ICDAR |
2001 |
DBLP DOI BibTeX RDF |
|
22 | Eiman Tamah Al-Shammari, Jessica Lin 0001 |
Towards an error-free Arabic stemming. |
CIKM-iNEWS |
2008 |
DBLP DOI BibTeX RDF |
text mining, tokenization, stemming, Arabic, lemmatization |
22 | Choochart Haruechaiyasak, Sarawoot Kongyoung, Chaianun Damrongrat |
LearnLexTo: a machine-learning based word segmentation for indexing Thai texts. |
CIKM-iNEWS |
2008 |
DBLP DOI BibTeX RDF |
indexing, tokenization, word segmentation |
22 | George Forman, Evan Kirshenbaum |
Extremely fast text feature extraction for classification and indexing. |
CIKM |
2008 |
DBLP DOI BibTeX RDF |
feature engineering, text tokenization, feature extraction, text mining, text indexing, document categorization, bag-of-words |
22 | Paul McNamee, Charles K. Nicholas, James Mayfield |
Don't have a stemmer?: be un+concern+ed. |
SIGIR |
2008 |
DBLP DOI BibTeX RDF |
unsupervised morphological segmentation, tokenization, stemming, character n-grams |
22 | Dan Tufis, Dan Stefanescu, Radu Ion, Alexandru Ceausu |
RACAI's Question Answering System at QA@CLEF2007. |
CLEF |
2007 |
DBLP DOI BibTeX RDF |
tagging, question answering, tokenization, query generation, lemmatization, answer extraction, Lucene, indexing and retrieval |
22 | Daniel P. Lopresti |
Performance evaluation for text processing of noisy inputs. |
SAC |
2005 |
DBLP DOI BibTeX RDF |
performance evaluation, optical character recognition, tokenization, part-of-speech tagging, sentence boundary detection |
22 | Paul McNamee, James Mayfield |
Translating pieces of words. |
SIGIR |
2005 |
DBLP DOI BibTeX RDF |
character n-gram tokenization, translation, cross-language information retrieval, parallel corpora |
22 | Robert C. Goldstein, Christian Wagner 0001 |
Database Management with Sequence Trees and Tokens. |
IEEE Trans. Knowl. Data Eng. |
1997 |
DBLP DOI BibTeX RDF |
performance, design, Abstract data types, database management, tokenization, file organization |
18 | Taewan Kim, Jinwoo Kim, Heeseok Oh, Jiwoo Kang |
Deep Transformer Based Video Inpainting Using Fast Fourier Tokenization. |
IEEE Access |
2024 |
DBLP DOI BibTeX RDF |
|
18 | Yigit Bekir Kaya, A. Cüneyd Tantug |
Effect of tokenization granularity for Turkish large language models. |
Intell. Syst. Appl. |
2024 |
DBLP DOI BibTeX RDF |
|
18 | Pierre Poitier, Jérôme Fink, Benoît Frénay |
Towards better transition modeling in recurrent neural networks: The case of sign language tokenization. |
Neurocomputing |
2024 |
DBLP DOI BibTeX RDF |
|
18 | Edo Dotan, Gal Jaschek, Tal Pupko, Yonatan Belinkov |
Effect of tokenization on transformers for biological sequences. |
Bioinform. |
2024 |
DBLP DOI BibTeX RDF |
|
18 | Jinbiao Yang |
Rethinking Tokenization: Crafting Better Tokenizers for Large Language Models. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
18 | Ninglu Shao, Shitao Xiao, Zheng Liu, Peitian Zhang |
Flexibly Scaling Large Language Models Contexts Through Extensible Tokenization. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
18 | Qijiong Liu, Hengchang Hu, Jiahao Wu 0004, Jieming Zhu, Min-Yen Kan, Xiao-Ming Wu 0003 |
Discrete Semantic Tokenization for Deep CTR Prediction. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
18 | Omer Goldman, Avi Caciularu, Matan Eyal, Kris Cao, Idan Szpektor, Reut Tsarfaty |
Unpacking Tokenization: Evaluating Text Compression and its Correlation with Model Performance. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
18 | Mahdi Biparva, Raika Karimi, Faezeh Faez, Yingxue Zhang 0001 |
Todyformer: Towards Holistic Dynamic Graph Transformers with Structure-Aware Tokenization. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
18 | Bo Yang, Chen Wang, Xiaoshuang Ma, Beiping Song, Zhuang Liu |
Zero-shot sketch-based remote sensing image retrieval based on multi-level and attention-guided tokenization. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
18 | Yang Jin, Zhicheng Sun 0001, Kun Xu, Kun Xu 0005, Liwei Chen, Hao Jiang, Quzhe Huang, Chengru Song, Yuliang Liu, Di Zhang, Yang Song, Kun Gai, Yadong Mu |
Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
18 | Mohamed Taher Alrefaie, Nour Eldin Morsy, Nada Samir |
Exploring Tokenization Strategies and Vocabulary Sizes for Enhanced Arabic Language Models. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
18 | Craig W. Schmidt, Varshini Reddy, Haoran Zhang, Alec Alameddine, Omri Uzan, Yuval Pinter, Chris Tanner |
Tokenization Is More Than Compression. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
18 | Ritwik Gupta, Shufan Li, Tyler Zhu, Jitendra Malik, Trevor Darrell, Karttikeya Mangalam |
xT: Nested Tokenization for Larger Context in Large Images. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
18 | Marco Cognetta, Vilém Zouhar, Sangwhan Moon, Naoaki Okazaki |
Two Counterexamples to Tokenization and the Noiseless Channel. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
18 | Delong Chen, Samuel Cahyawijaya, Jianfeng Liu, Baoyuan Wang, Pascale Fung |
Subobject-level Image Tokenization. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
18 | Leonidas Gee, Leonardo Rigutini, Marco Ernandes, Andrea Zugarini |
Multi-Word Tokenization for Sequence Compression. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
18 | Yingbo Ma, Suraj Kolla, Dhruv Kaliraman, Victoria Nolan, Zhenhong Hu, Ziyuan Guan, Yuanfang Ren, Brooke Armfield, Tezcan Ozrazgat-Baslanti, Tyler J. Loftus, Parisa Rashidi, Azra Bihorac, Benjamin Shickel |
Temporal Cross-Attention for Dynamic Embedding and Tokenization of Multimodal Electronic Health Records. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
18 | Yanis Labrak, Adrien Bazoge, Béatrice Daille, Mickael Rouvier, Richard Dufour |
How Important Is Tokenization in French Medical Masked Language Models? |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
18 | Geeling Chau, Yujin An, Ahamed Raffey Iqbal, Soon-Jo Chung, Yisong Yue, Sabera Talukder |
Generalizability Under Sensor Failure: Tokenization + Transformers Enable More Robust Latent Spaces. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
18 | Catherine Arnett, Pamela D. Rivière, Tyler A. Chang, Sean Trott |
Different Tokenization Schemes Lead to Comparable Performance in Spanish Number Agreement. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
18 | Serghiy Obushnyi, Denys Virovets, Maksym Zhytar, Yuliia Zhdanova |
Unlocking the Economic Potential of Real Estate Tokenization in Ukraine. |
DECaT |
2024 |
DBLP BibTeX RDF |
|
18 | Rob Goot |
Where are we Still Split on Tokenization? |
EACL (Findings) |
2024 |
DBLP BibTeX RDF |
|
18 | Venkata Krishna Paanchajanya Kuppa, Rajendra Hegadi, Karthik Sajjan, Karusala Deepak Chowdary, Manoj Sahit Reddy Vanga, Poonam Kumar |
Investopolis: Decentralized Customer Loyalty Tokenization on the Blockchain. |
ICCE |
2024 |
DBLP DOI BibTeX RDF |
|
18 | Cagri Toraman, Eyup Halit Yilmaz, Furkan Sahinuç, Oguzhan Ozcelik |
Impact of Tokenization on Language Models: An Analysis for Turkish. |
ACM Trans. Asian Low Resour. Lang. Inf. Process. |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Asma Mekki, Inès Zribi, Mariem Ellouze, Lamia Hadrich Belguith |
Tokenization of Tunisian Arabic: A Comparison between Three Machine Learning Models. |
ACM Trans. Asian Low Resour. Lang. Inf. Process. |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Zhenyu Xu, Huiqiang Hu, Tingting Wang, Yuping Zhao, Cong Zhou, Huaxing Xu, Xiaobo Mao |
Identification of growth years of Kudzu root by hyperspectral imaging combined with spectral-spatial feature tokenization transformer. |
Comput. Electron. Agric. |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Minseo Kang, Byunghan Lee |
TiCTok: Time-Series Anomaly Detection With Contrastive Tokenization. |
IEEE Access |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Ali Erkan, Tunga Güngör |
Analysis of Deep Learning Model Combinations and Tokenization Approaches in Sentiment Classification. |
IEEE Access |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Yifei Xu, Yixuan Xie, Bicheng Li, Chuanqi Xie, Yongchuan Zhang, Aichen Wang, Li Zhu |
Spatial-Spectral 1DSwin Transformer With Groupwise Feature Tokenization for Hyperspectral Image Classification. |
IEEE Trans. Geosci. Remote. Sens. |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Lina Gao, Bing Liu 0022, Ping Fu, Mingzhu Xu |
Adaptive Spatial Tokenization Transformer for Salient Object Detection in Optical Remote Sensing Images. |
IEEE Trans. Geosci. Remote. Sens. |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Arjun Rachana Harish, Xinlai Liu, Ming Li 0055, Ray Y. Zhong, George Q. Huang |
Blockchain-enabled digital assets tokenization for cyber-physical traceability in E-commerce logistics financing. |
Comput. Ind. |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Zhihong Lei, Ernest Pusateri, Shiyi Han, Leo Liu, Mingbin Xu, Tim Ng, Ruchir Travadi, Youyuan Zhang, Mirko Hannemann, Man-Hung Siu, Zhen Huang 0001 |
Personalization of CTC-based End-to-End Speech Recognition Using Pronunciation-Driven Subword Tokenization. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Tomer Wullach, Shlomo E. Chazan |
Optimized Tokenization for Transcribed Error Correction. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Vardaan Pahuja, A. J. Piergiovanni, Anelia Angelova |
Diversifying Joint Vision-Language Tokenization Learning. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Lisa Beinborn, Yuval Pinter |
Analyzing Cognitive Plausibility of Subword Tokenization. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Tatsuya Hiraoka, Tomoya Iwakura |
Tokenization Tractability for Human and Machine Learning Model: An Annotation Study. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Nathan Fradet, Jean-Pierre Briot, Fabien Chhel, Amal El Fallah Seghrouchni, Nicolas Gutowski |
miditok: A Python package for MIDI file tokenization. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Tomasz Limisiewicz, Jirí Balhar, David Marecek |
Tokenization Impacts Multilingual Language Modeling: Assessing Vocabulary Allocation and Overlap Across Languages. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Jakob Drachmann Havtorn, Amelie Royer, Tijmen Blankevoort, Babak Ehteshami Bejnordi |
MSViT: Dynamic Mixed-Scale Tokenization for Vision Transformers. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Avijit Thawani, Saurabh Ghanekar, Xiaoyuan Zhu, Jay Pujara |
Learn Your Tokens: Word-Pooled Tokenization for Language Modeling. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Martin Berglund, Brink van der Merwe |
Formalizing BPE Tokenization. |
NCMA |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Tae-Hee Jeon, Bongseok Yang, Changhwan Kim, Yoonseob Lim |
Improving Korean NLP Tasks with Linguistically Informed Subword Tokenization and Sub-character Decomposition. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Siyang Liu 0003, Naihao Deng, Sahand Sabour, Yilin Jia, Minlie Huang, Rada Mihalcea |
Enhancing Long-form Text Generation Efficacy with Task-adaptive Tokenization. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Xinfa Zhu, Yuanjun Lv, Yi Lei, Tao Li, Wendi He, Hongbin Zhou, Heng Lu 0004, Lei Xie 0001 |
Vec-Tok Speech: speech vectorization and tokenization for neural speech generation. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Sid Wang, Akshat Shrivastava, Sasha Livshits |
TreePiece: Faster Semantic Parsing via Tree Tokenization. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Yang Jin, Kun Xu, Kun Xu 0005, Liwei Chen, Chao Liao, Jianchao Tan, Quzhe Huang, Bin Chen, Chenyi Lei, An Liu, Chengru Song, Xiaoqiang Lei, Di Zhang, Wenwu Ou, Kun Gai, Yadong Mu |
Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | David Yunis, Justin Jung, Falcon Z. Dai, Matthew R. Walter |
Subwords as Skills: Tokenization for Sparse-Reward Reinforcement Learning. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Adarsh Kumar, Pedro Sarmento |
From Words to Music: A Study of Subword Tokenization Techniques in Symbolic Music Generation. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Anton Kolonin |
Self-tuning hyper-parameters for unsupervised cross-lingual tokenization. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Yang Tan, Mingchen Li, Pan Tan, Ziyi Zhou, Huiqun Yu, Guisheng Fan, Liang Hong |
PETA: Evaluating the Impact of Protein Transfer Learning with Sub-word Tokenization on Downstream Applications. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Harsh Chaudhari, Anuja Patil, Dhanashree Lavekar, Pranav Khairnar, Raviraj Joshi, Sachin Pande |
On Significance of Subword tokenization for Low Resource and Efficient Named Entity Recognition: A case study in Marathi. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Vilém Zouhar, Clara Meister, Juan Luis Gastaldi, Li Du, Mrinmaya Sachan, Ryan Cotterell |
Tokenization and the Noiseless Channel. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Bar Iluz, Tomasz Limisiewicz, Gabriel Stanovsky, David Marecek |
Exploring the Impact of Training Data Distribution and Subword Tokenization on Gender Bias in Machine Translation. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Weichao Zhao, Hezhen Hu, Wengang Zhou, Jiaxin Shi, Houqiang Li |
BEST: BERT Pre-Training for Sign Language Recognition with Coupling Tokenization. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Benoist Wolleb, Romain Silvestri, Giorgos Vernikos, Ljiljana Dolamic, Andrei Popescu-Belis |
Assessing the Importance of Frequency versus Compositionality for Subword-based Tokenization in NMT. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Bernal Jiménez Gutiérrez, Huan Sun 0001, Yu Su 0001 |
Biomedical Language Models are Robust to Sub-optimal Tokenization. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Christos Theodoropoulos 0001, Marie-Francine Moens |
An Information Extraction Study: Take In Mind the Tokenization! |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Zhichao Huang, Chutong Meng, Tom Ko |
RepCodec: A Speech Representation Codec for Speech Tokenization. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Orevaoghene Ahia, Sachin Kumar 0009, Hila Gonen, Jungo Kasai, David R. Mortensen, Noah A. Smith, Yulia Tsvetkov |
Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Maksym Lysak, Ahmed S. Nassar, Nikolaos Livathinos, Christoph Auer, Peter W. J. Staar |
Optimized Table Tokenization for Table Structure Recognition. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Haozhe An, Rachel Rudinger |
Nichelle and Nancy: The Influence of Demographic Attributes and Tokenization Length on First Name Biases. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Tomer Ronen, Omer Levy, Avram Golbert |
Vision Transformers with Mixed-Resolution Tokenization. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Xuguang Ai, Ramakanth Kavuluru |
End-to-End Models for Chemical-Protein Interaction Extraction: Better Tokenization and Span-Based Pipeline Strategies. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Verena Blaschke, Hinrich Schütze, Barbara Plank |
Does Manipulating Tokenization Aid Cross-Lingual Transfer? A Study on POS Tagging for Non-Standardized Languages. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Mikhail Tikhomirov, Daniil Chernyshev |
Impact of Tokenization on LLaMa Russian Adaptation. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | David Samuel, Lilja Øvrelid |
Tokenization with Factorized Subword Encoding. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Luca Gagliardelli, Luca Zecchini, Luca Ferretti, Domenico Beneventano, Giovanni Simonini, Sonia Bergamaschi, Mirko Orsini, Luca Magnotta, Emma Mescoli, Andrea Livaldi, Nicola Gessa, Piero De Sabbata, Gianluca D'Agosta, Fabrizio Paolucci, Fabio Moretti |
A big data platform exploiting auditable tokenization to promote good practices inside local energy communities. |
Future Gener. Comput. Syst. |
2023 |
DBLP DOI BibTeX RDF |
|
18 | Yan Zhuang, Chi-Ren Shyu, Shenda Hong, Pengfei Li, Luxia Zhang |
Self-sovereign identity empowered non-fungible patient tokenization for health information exchange using blockchain technology. |
Comput. Biol. Medicine |
2023 |
DBLP DOI BibTeX RDF |
|
Displaying results #1 - #100 of 385 (100 per page).