Results for "cs.CL"

Showing 20 of ~154638 results · from DOAJ, arXiv, CrossRef

arXiv Open Access 2025
ClonEval: An Open Voice Cloning Benchmark

Iwona Christop, Tomasz Kuczyński, Marek Kubis

We present a novel benchmark for voice cloning text-to-speech models. The benchmark consists of an evaluation protocol, an open-source library for assessing the performance of voice cloning models, and an accompanying leaderboard. The paper discusses design considerations and presents a detailed description of the evaluation procedure. The usage of the software library is explained, along with the organization of results on the leaderboard.

en cs.CL
CrossRef Open Access 2023
First-principles predictions of enhanced thermoelectric properties for Cs<sub>2</sub>SnI<sub>2</sub>Cl<sub>2</sub> and Cs<sub>2</sub>PbI<sub>2</sub>Cl<sub>2</sub> monolayers with spin–orbit coupling

Jiajia Fei, Xiaojiao Zhang, Jialin Li et al.

Inspired by the exceptional charge transport properties and ultra-low thermal conductivity of halide perovskites, we investigate the electronic nature, thermal transport, and thermoelectric properties of the Ruddlesden–Popper all-inorganic perovskite Cs2SnI2Cl2 and Cs2PbI2Cl2 monolayers using first-principles calculations. During the calculations, spin–orbit coupling has been considered for electronic transport as well as thermoelectric properties. The results show that the Cs2SnI2Cl2 and Cs2PbI2Cl2 monolayers exhibit high carrier mobility and low thermal conductivity. Stronger phonon–phonon interaction is responsible for the fact that the thermal conductivity of the Cs2SnI2Cl2 monolayer is much lower than that of the Cs2PbI2Cl2 monolayer. At 700 K, the values of the figure of merit (ZT) for the n-type doped Cs2SnI2Cl2 and Cs2PbI2Cl2 monolayers are about 1.05 and 0.32 at the optimized carrier concentrations of 5.42 × 10¹² cm⁻² and 9.84 × 10¹² cm⁻². Moreover, when spin–orbit coupling is considered, the corresponding ZT values are enhanced to 2.73 and 1.98 at 5.27 × 10¹¹ cm⁻² and 6.16 × 10¹¹ cm⁻². These results signify that Cs2SnI2Cl2 and Cs2PbI2Cl2 monolayers are promising thermoelectric candidates.
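For reference, the figure of merit ZT quoted in this abstract is the standard dimensionless thermoelectric figure of merit (the general definition, not anything specific to this paper):

```latex
ZT = \frac{S^{2}\,\sigma\,T}{\kappa_{e} + \kappa_{l}}
```

where S is the Seebeck coefficient, σ the electrical conductivity, T the absolute temperature, and κ_e and κ_l the electronic and lattice contributions to the thermal conductivity.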

4 citations en
arXiv Open Access 2023
SIGMORPHON 2023 Shared Task of Interlinear Glossing: Baseline Model

Michael Ginn

Language documentation is a critical aspect of language preservation, often including the creation of Interlinear Glossed Text (IGT). Creating IGT is time-consuming and tedious, and automating the process can save valuable annotator effort. This paper describes the baseline system for the SIGMORPHON 2023 Shared Task of Interlinear Glossing. In our system, we utilize a transformer architecture and treat gloss generation as a sequence labelling task.
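The shared-task transformer itself cannot be reproduced in a few lines, but the task framing the abstract describes — gloss generation as sequence labelling, one gloss label per input morpheme — can be sketched with a trivial dictionary baseline. All data and names here are illustrative, not from the shared task:

```python
from collections import Counter, defaultdict

def train_gloss_baseline(pairs):
    """Count how often each morpheme receives each gloss label,
    then keep the most frequent gloss per morpheme."""
    counts = defaultdict(Counter)
    for morphemes, glosses in pairs:
        for m, g in zip(morphemes, glosses):
            counts[m][g] += 1
    return {m: c.most_common(1)[0][0] for m, c in counts.items()}

def gloss(model, morphemes, unk="???"):
    """Label each morpheme independently, like a per-token tagger."""
    return [model.get(m, unk) for m in morphemes]

# Toy interlinear data: (segmented word, gloss line)
train = [
    (["ni", "k", "neki"], ["1SG", "PRS", "want"]),
    (["ni", "k", "itta"], ["1SG", "PRS", "see"]),
]
model = train_gloss_baseline(train)
print(gloss(model, ["ni", "k", "neki"]))  # ['1SG', 'PRS', 'want']
```

A neural tagger replaces the dictionary lookup with a contextual per-token classifier, but the input/output contract is the same.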

en cs.CL
arXiv Open Access 2022
A dynamic programming algorithm for span-based nested named-entity recognition in O(n^2)

Caio Corro

Span-based nested named-entity recognition (NER) has a cubic-time complexity using a variant of the CYK algorithm. We show that by adding a supplementary structural constraint on the search space, nested NER has a quadratic-time complexity, i.e. the same asymptotic complexity as the non-nested case. The proposed algorithm covers a large part of three standard English benchmarks and delivers comparable experimental results.
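The paper's constrained quadratic algorithm is beyond a snippet, but the cubic baseline the abstract refers to — a CYK-style chart that scores nested bracketings over spans — can be sketched. The span scorer here is a hypothetical stand-in for a learned model:

```python
def best_span_tree(n, span_score):
    """CYK-style chart: best[i][j] = best score of a binary
    bracketing over tokens i..j-1. The split loop makes this
    O(n^3); the paper's structural constraint removes it,
    yielding O(n^2)."""
    best = [[0.0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            split = 0.0
            if length > 1:
                split = max(best[i][k] + best[k][j] for k in range(i + 1, j))
            best[i][j] = span_score(i, j) + split
    return best[0][n]

# Toy scorer: only the spans (0,2) and (2,3) carry entity evidence
scores = {(0, 2): 2.0, (2, 3): 1.0}
print(best_span_tree(3, lambda i, j: scores.get((i, j), 0.0)))  # 3.0
```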

en cs.CL
arXiv Open Access 2021
Fixing exposure bias with imitation learning needs powerful oracles

Luca Hormann, Artem Sokolov

We apply imitation learning (IL) to tackle the NMT exposure-bias problem with error-correcting oracles. We evaluate an SMT lattice-based oracle which, despite its excellent performance on an unconstrained oracle-translation task, turned out to be too pruned and idiosyncratic to serve as the oracle for IL.

en cs.CL, cs.LG
arXiv Open Access 2020
A Survey of Neural Networks and Formal Languages

Joshua Ackerman, George Cybenko

This report is a survey of the relationships between various state-of-the-art neural network architectures and formal languages as, for example, structured by the Chomsky Language Hierarchy. Of particular interest are the abilities of a neural architecture to represent, recognize and generate words from a specific language by learning from samples of the language.

en cs.CL
arXiv Open Access 2019
Fine-tune BERT for Extractive Summarization

Yang Liu

BERT, a pre-trained Transformer model, has achieved ground-breaking performance on multiple NLP tasks. In this paper, we describe BERTSUM, a simple variant of BERT, for extractive summarization. Our system is the state of the art on the CNN/Dailymail dataset, outperforming the previous best-performing system by 1.65 on ROUGE-L. The code to reproduce our results is available at https://github.com/nlpyang/BertSum.
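BERTSUM itself requires pretrained weights, but the extractive framing it builds on — score every sentence, keep the top-k in document order — can be sketched with a simple frequency-based scorer. This is a generic baseline, not the paper's model:

```python
from collections import Counter

def extract_summary(sentences, k=1):
    """Score each sentence by the average corpus frequency of its
    words, then return the top-k sentences in document order."""
    tokenized = [s.lower().split() for s in sentences]
    freq = Counter(w for toks in tokenized for w in toks)

    def score(toks):
        return sum(freq[w] for w in toks) / max(len(toks), 1)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(tokenized[i]), reverse=True)
    keep = sorted(ranked[:k])          # restore document order
    return [sentences[i] for i in keep]

doc = ["the model beats baselines",
       "the model uses the transformer",
       "lunch was nice"]
print(extract_summary(doc, k=1))  # ['the model uses the transformer']
```

BERTSUM replaces the frequency score with a classifier head over BERT's sentence representations; the selection step is unchanged.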

en cs.CL
arXiv Open Access 2018
The Evolution of Popularity and Images of Characters in Marvel Cinematic Universe Fanfictions

Fan Bu

This analysis proposes a new topic model to study the yearly trends in Marvel Cinematic Universe fanfictions on three levels: character popularity, character images/topics, and vocabulary pattern of topics. It is found that character appearances in fanfictions have become more diverse over the years thanks to constant introduction of new characters in feature films, and in the case of Captain America, multi-dimensional character development is well-received by the fanfiction world.

en cs.CL
arXiv Open Access 2018
Extrapolation in NLP

Jeff Mitchell, Pasquale Minervini, Pontus Stenetorp et al.

We argue that extrapolation to examples outside the training space will often be easier for models that capture global structures, rather than just maximise their local fit to the training data. We show that this is true for two popular models: the Decomposable Attention Model and word2vec.

en cs.CL
arXiv Open Access 2018
A case for deep learning in semantics

Christopher Potts

Pater's target article builds a persuasive case for establishing stronger ties between theoretical linguistics and connectionism (deep learning). This commentary extends his arguments to semantics, focusing in particular on issues of learning, compositionality, and lexical meaning.

en cs.CL
CrossRef Open Access 2017
[DMAPTs]<sup>+</sup>Cl<sup>−</sup>: A Promising Versatile Regioselective Tosyl Transfer Reagent

Chiranjeevi Donthulachitti, Sridhar Reddy Kothakapu, Ravi Kumar Shekunti et al.

Regioselective monotosylation, for the first time on 2,4-syn and 2,4-anti configured ene-tetrols and on diols and triols, was achieved in excellent yields and selectivity by employing [DMAPTs]+Cl−, 1. We also proposed and developed operationally flexible, preferential regio-complementary tosylation of the cis-1,2-diol moiety of pyranosides by employing 1. Further, NMR correlations for unambiguous structure determination of regioselectivity, and also the preferred conformations of these pyranosides, were established.

17 citations en
arXiv Open Access 2017
Convolutional Attention-based Seq2Seq Neural Network for End-to-End ASR

Dan Lim

This thesis introduces a sequence-to-sequence model with Luong's attention mechanism for end-to-end ASR. It also describes several neural network techniques, including batch normalization, dropout, and residual networks, which constitute the convolutional attention-based seq2seq neural network. Finally, the proposed model proved its effectiveness for speech recognition, achieving a 15.8% phoneme error rate on the TIMIT dataset.
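The attention mechanism named in this abstract — Luong's global attention with the "dot" score — is compact enough to sketch in pure Python: score each encoder state against the current decoder state, softmax the scores, and take the weighted sum as the context vector. Vectors here are plain lists and the values are illustrative:

```python
import math

def luong_dot_attention(decoder_state, encoder_states):
    """Luong global attention, 'dot' score:
    a_t(s) = softmax(h_t . h_s); context = sum_s a_t(s) * h_s."""
    scores = [sum(d * e for d, e in zip(decoder_state, h))
              for h in encoder_states]
    m = max(scores)                                  # numerically
    exp = [math.exp(s - m) for s in scores]          # stable softmax
    z = sum(exp)
    weights = [e / z for e in exp]
    dim = len(decoder_state)
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

h_t = [1.0, 0.0]
enc = [[1.0, 0.0], [0.0, 1.0]]
weights, context = luong_dot_attention(h_t, enc)
print(weights)  # the encoder state aligned with h_t gets the higher weight
```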

en cs.CL
arXiv Open Access 2017
Build Fast and Accurate Lemmatization for Arabic

Hamdy Mubarak

In this paper we describe the complexity of building a lemmatizer for Arabic, which has a rich and complex derivational morphology, and we discuss the need for fast and accurate lemmatization to enhance Arabic Information Retrieval (IR) results. We also introduce a new data set that can be used to test lemmatization accuracy, and an efficient lemmatization algorithm that outperforms state-of-the-art Arabic lemmatization in terms of accuracy and speed. We make the data set and the code publicly available.

en cs.CL
arXiv Open Access 2017
Job Detection in Twitter

Besat Kassaie

In this report, we propose a new application for Twitter data called job detection: identifying people's job category based on their tweets. As preliminary work, we limited the task to distinguishing IT workers from other job holders. We used and compared both a simple bag-of-words model and a document representation based on the Skip-gram model. Our results show that the Skip-gram-based model achieves 76% precision and 82% recall.
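Of the two representations this abstract compares, the skip-gram side requires training embeddings, but the bag-of-words side is just token counts over a fixed vocabulary. A minimal sketch, with an illustrative vocabulary:

```python
from collections import Counter

def bag_of_words(tweet, vocab):
    """Represent a tweet as a vector of token counts over vocab;
    out-of-vocabulary tokens are simply dropped."""
    counts = Counter(tweet.lower().split())
    return [counts[w] for w in vocab]

vocab = ["python", "deploy", "server", "coffee"]
print(bag_of_words("Deploy the server then deploy again", vocab))
# [0, 2, 1, 0]
```

These count vectors feed directly into any off-the-shelf classifier; the skip-gram variant replaces them with averaged word embeddings.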

en cs.CL
arXiv Open Access 2016
Synthetic Language Generation and Model Validation in BEAST2

Stuart Bradley

Generating synthetic languages aids in the testing and validation of future computational linguistic models and methods. This thesis extends the BEAST2 phylogenetic framework to add linguistic sequence generation under multiple models. The new plugin is then used to test the effect of the phenomenon of word borrowing on the inference process under two widely used phylolinguistic models.

en cs.CL
arXiv Open Access 2011
Why is language well-designed for communication? (Commentary on Christiansen and Chater: 'Language as shaped by the brain')

Jean-Louis Dessalles

Selection through iterated learning explains no better than other non-functional accounts, such as universal grammar, why language is so well-designed for communicative efficiency. It does not predict several distinctive features of language, such as center embedding, large lexicons, or the lack of iconicity, that seem to serve communication purposes at the expense of learnability.

en cs.CL, q-bio.NC
CrossRef Open Access 2009
ChemInform Abstract: A Complex Cesiumchloride: Cs<sub>5</sub>[AgCl<sub>2</sub>][CoCl<sub>4</sub>]Cl<sub>2</sub>.

Oliver Fastje, Angela Moeller

ChemInform is a weekly Abstracting Service, delivering concise at-a-glance information extracted from about 200 leading journals. To access the ChemInform Abstract of an article which was published elsewhere, please select a "Full Text" option. The original article is trackable via the "References" option.

Page 7 of 7732