This list is closed, nobody may subscribe to it.
From: Carlos R. <car...@li...> - 2021-03-08 18:49:41
|
Dear all, This list has been the main communication tool of the MWE community for 14 years or so. We will proceed to *shutting down this list in the next couple of days*. The archive should still be available, but the list will no longer be active. The new mailing list <https://groups.google.com/g/siglex-mwe-members> now has more than 240 members: thanks for having re-registered. If you have not subscribed yet, feel free to do it at any time to avoid missing the latest news. To do so, simply register as a SIGLEX member <https://siglex.org/members>, it's simple and free. All the best Carlos, on behalf of the SIGLEX-MWE Section |
From: Carlos R. <car...@li...> - 2021-02-18 08:24:56
|
Dear all, Please remember to re-register to SIGLEX <https://siglex.org/members.html> and check the MWE Section checkbox, if you have not already done so. The new mailing list <https://multiword.org/mailinglist> is already active; this mailing list at sourceforge will be *shut down on March 1st.* Best regards, Carlos |
From: Carlos R. <car...@li...> - 2021-01-28 11:12:02
|
Dear all, We are happy to announce that the SIGLEX-MWE Section is upgrading its infrastructure. We have a new website <https://multiword.org/> and our members list was integrated with the SIGLEX members <https://siglex.org/members> directory. We have also created a new mailing list <https://multiword.org/mailinglist> for registered members in the SIGLEX directory. Therefore, we will be shortly *shutting down* this list mul...@li... We kindly ask you, if you have not already done so, to register as a SIGLEX member <https://siglex.org/members>. Do not forget to tick the "MWE" checkbox so that we can add you to the new MWE mailing list. Please, take action in the next 30 days. *The list will be definitely shut down on March 1, 2021*. If you have any trouble registering to the new directory and mailing list, just drop a line: sig...@go... Best Carlos, on behalf of the MWE Section's standing committee <https://multiword.org/organization/standingcommittee> |
From: Shiva T. <sh....@gm...> - 2021-01-12 14:49:36
|
---------------------------------------- [apologies for any cross-posting] ---------------------------------------- *17th Workshop on Multiword Expressions (MWE 2021)* Colocated with ACL-IJCNLP 2021 (Bangkok, Thailand), 5 or 6 August 2021 *Deadline: April 19, 2021* https://multiword.org/mwe2021/ *Organised and sponsored by: SIGLEX, the Special Interest Group on the Lexicon of the ACL* *** FIRST CALL FOR PAPERS *** Multiword expressions (MWEs) are word combinations which exhibit lexical, syntactic, semantic, pragmatic and/or statistical idiosyncrasies (Baldwin & Kim 2010), such as by and large, hot dog, pay a visit and pull one's leg. The notion encompasses closely related phenomena: idioms, compounds, light-verb constructions, rhetorical figures, institutionalised phrases, collocations, etc. The behaviour of MWEs is often unpredictable, in particular their meanings are not regularly composed of the meanings of their parts. Thus, MWEs are a major challenge in computational linguistics (Constant et al. 2017), including linguistic modelling (e.g. treebanking), computational modelling (e.g. parsing), and end user NLP applications (e.g. natural language understanding, machine translation, and social media mining). Modelling and processing MWEs for NLP has been the topic of the MWE workshop organised by the MWE section <http://multiword.org/> of SIGLEX <https://siglex.org/> in conjunction with major NLP conferences since 2003. Although much progress has been made in the field, MWE processing in end-user NLP tasks is currently under-explored, and most studies still introduce MWEs as future work. Nonetheless, there are recent studies in which MWEs gained particular attention in end-user applications, including machine translation (Zaninello & Birch 2020), text simplification (Kochmar et al. 2020, Liu & Hwa 2016), language learning and assessment (Paquot et al. 2019, Christiansen & Arnon 2017), social media mining (Maisto et al. 
2017), and abusive language detection (Zampieri et al. 2020, Caselli et al. 2020). The special focus for this 17th edition of the workshop is on *MWE processing in end-user applications* such as those listed above. On the one hand, the PARSEME shared tasks (Ramisch et al. 2020, Ramisch et al. 2018, Savary et al. 2017), among others, fostered significant progress in MWE identification, providing datasets, evaluation measures and tools that now allow fully integrating MWE identification into end-user applications. On the other hand, NLP seems to be shifting towards end-to-end neural models capable of solving complex end-user tasks with little or no intermediary linguistic symbols, questioning the extent to which MWEs should be implicitly or explicitly modelled. Therefore, one goal of this workshop is to bring together and encourage researchers in various NLP subfields to submit MWE-related research, so that approaches that deal with MWEs in various applications could benefit from each other. Following the success of previous joint workshops LAW-MWE-CxG 2018 <http://multiword.sourceforge.net/lawmwecxg2018/>, MWE-WN 2019 <http://multiword.sourceforge.net/mwewn2019/> and MWE-LEX 2020 <http://multiword.sourceforge.net/mwelex2020/>, we further extend the scope of the workshop to MWEs in e-lexicons and WordNets, MWE annotation, as well as grammatical constructions. The 17th Workshop on MWEs invites submissions on (but not limited to) the following topics: *Traditional MWE topics:* - Computationally-applicable theoretical work on MWEs and constructions in psycholinguistics and corpus linguistics - MWE and construction annotation and representation in resources such as corpora, treebanks, e-lexicons and WordNets - Processing of MWEs and constructions in syntactic and semantic frameworks (e.g. CCG, CxG, HPSG, LFG, TAG, UD, etc.) 
- Discovery and identification methods for MWEs and constructions - MWEs and constructions in language acquisition, language learning, and non-standard language (e.g. tweets, speech) - Evaluation of annotation and processing techniques for MWEs and constructions - Retrospective comparative analyses from the PARSEME shared tasks on automatic identification of MWEs *Topics on MWEs and end-user applications:* - Processing of MWEs and constructions in end-user applications (e.g. MT, NLU, summarisation, social media mining, computer assisted language learning) - Implicit and explicit representation of MWEs and constructions in end-user applications - Evaluation of end-user applications concerning MWEs and constructions - Resources and tools for MWEs and constructions (e.g. lexicons, identifiers) in end-user applications *** JOINT SESSION WITH WOAH WORKSHOP *** Pursuing the MWE Section's tradition of synergies with other communities and in accordance with ACL-IJCNLP 2021's theme track on NLP for social good, we will organise a joint session with the Workshop on Online Abuse and Harm (WOAH) <https://www.workshopononlineabuse.com/>. We believe that MWEs are important in online abuse detection, and that the latter can provide an interesting testbed for MWE processing technology. The main goal is to pave the way towards the creation of data for a shared task involving both communities. The format of the session is under discussion, and we welcome suggestions from the community. Submissions describing research on MWEs and abusive language, especially introducing new datasets, are also welcome. *** SUBMISSION MODALITIES *** - Long papers (8 content pages + references) should report on solid and finished research including new experimental results, resources and/or techniques. - Short papers (4 content pages + references) should report on small experiments, focused contributions, ongoing research, negative results and/or philosophical discussion. 
In regular research papers, the reported research should be substantially original. Papers available as preprints can also be submitted provided that they fulfil the conditions defined by the ACL Policies for Submission, Review and Citation <https://www.aclweb.org/portal/content/new-policies-submission-review-and-citation>. Notice that double submission to ACL-IJCNLP 2021 main conference and MWE 2021 is allowed but should be notified at submission time, as per the ACL-IJCNLP 2021 call for papers <https://2021.aclweb.org/calls/papers/#multiple-submission-policy>: "[...] papers can be dual-submitted to both ACL-IJCNLP 2021 and an ACL-IJCNLP 2021 workshop which has its submission deadline falling before our notification date of May 5, 2021." Submission is ***double-blind*** as per the ACL-IJCNLP 2021 guidelines <https://2021.aclweb.org/calls/papers/#paper-submission-information>. For all types of submission, the ACL-IJCNLP 2021 templates <https://2021.aclweb.org/calls/papers/#paper-submission-and-templates> must be used. There is no limit on the number of reference pages. An extra page will be allowed to take the reviewers' comments into account in the final versions of accepted papers (long = 9 content pages, short = 5 content pages). The decisions as to oral or poster presentations of the selected papers will be taken by the PC chairs, depending on the available infrastructure for participation (presential and/or virtual). No distinction between papers presented orally and as posters is made in the workshop proceedings. All papers should be submitted via the workshop's START space, available soon. Please choose the appropriate submission modality (long/short). *** CONTACT *** For any inquiries regarding the workshop please send an email to mwe...@gm... *** IMPORTANT DATES *** All deadlines are at 23:59 UTC-12 (anywhere in the world). 
- April 19, 2021: Paper Submission Deadline - May 28, 2021: Notification of Acceptance - June 7, 2021: Camera-ready papers due - August 5 or 6, 2021: Workshop (Date TBD) *** ORGANIZERS *** - *Program chairs: *Paul Cook, Jelena Mitrović, Carla Parra Escartín and Ashwini Vaidya - *Publication chairs:* Petya Osenova and Shiva Taslimipoor - *Communication chair: *Carlos Ramisch |
From: Israel C. <coh...@gm...> - 2021-01-07 13:47:54
|
The actual link is http://multiword.sourceforge.net/MWE_MAIN/2020-SIGLEX-MWE-yearly-report.pdf It was published on 8 January 2021. As an alternative, you can copy and paste the link that Carlos sent us into your browser's URL area and manually change 2019 to 2020. That's how I retrieved the current Report. Moral of this story: Don't believe everything you see. Regards to all, Izzy coh...@gm... |
From: Carlos R. <car...@gm...> - 2021-01-07 12:59:28
|
Dear all, Sorry the link to the report was pointing to the 2019 report. Here's the correct link for 2020: http://multiword.sourceforge.net/MWE_MAIN/2020-SIGLEX-MWE-yearly-report.pdf Carlos |
From: Francis B. <bo...@ie...> - 2021-01-07 03:06:19
|
Thank you! -- Francis Bond <http://www3.ntu.edu.sg/home/fcbond/> Division of Linguistics and Multilingual Studies Nanyang Technological University |
From: Carlos R. <car...@gm...> - 2021-01-07 00:37:27
|
Dear SIGLEX-MWE Section members, Welcome to the new MWE Section mailing list! According to the Section's constitution <http://multiword.sourceforge.net/MWE_MAIN/SIGLEX-MWE-section-constitution-2020-11-21.pdf>, the MWE Section's Standing Committee should report yearly on the Section's activities to its members. Please, find the 2020 report at: http://multiword.sourceforge.net/MWE_MAIN/2020-SIGLEX-MWE-yearly-report.pdf <http://multiword.sourceforge.net/MWE_MAIN/2019-SIGLEX-MWE-yearly-report.pdf> Best regards, Carlos Ramisch (also on behalf of Agata Savary, Paul Cook, Jelena Mitrović, Petya Osenova, Carla Parra Escartín, Shiva Taslimipoor, Ashwini Vaidya) P.S.: if you want to modify your subscription to this mailing list, please update your SIGLEX membership information on the SIGLEX's members directory <https://siglex.org/members.html> |
From: Valia K. <eva...@an...> - 2020-12-23 11:44:31
|
Apologies for cross-postings ------------------------------------------------------ Call for Papers Workshop on "Benchmarking: Past, Present and Future" Co-located with ACL-IJCNLP 2021 to be held in Bangkok, on August 5-6, 2021 Webpage: https://github.com/kwchurch/Benchmarking_past_present_future/blob/master/README.md Important Dates * April 26, 2021: Paper submission deadline * May 28, 2021: Notification of acceptance * June 7, 2021: Camera-ready papers due * August 5-6, 2021: Workshop dates -------------------------------------------------------- It is easier to talk about the past than the future. These days, benchmarks evolve more bottom up (such as papers with code). There used to be more top-down leadership from government (and industry, in the case of systems, with benchmarks such as SPEC). Going forward, there may be more top-down leadership from organizations like MLPerf and/or influencers like David Ferrucci, who was responsible for IBM’s success with Jeopardy, and has recently written a paper suggesting how the community should think about benchmarking for machine comprehension (To Test Machine Comprehension, Start by Defining Comprehension). Tasks such as reading comprehension become even more interesting as we move beyond English. Multilinguality introduces many challenges, and even more opportunities. Keynote Talks We have an amazing collection of invited talks, many with direct first-hand knowledge of the history, and many insights for the future: 1. Past a. John Makhoul b. Mark Liberman: Reproducible Research and the Common Task Method c. Ellen Voorhees 2. Present a. Ming Zhou (Microsoft) b. Hua Wu and Jing Liu (Baidu) c. Neville Ryant DIHARD d. Brian MacWhinney and Saturnino Haider, Dementia Challenge e. Samuel Bowman (GLUE) f. Douwe Kiela (https://dynabench.org/) g. Eunsol Choi 3. Future a. MLPerf Greg Diamos The 2021 SpeechNet Challenge b. David Ferrucci c. 
Ido Dagan Submissions We accept three types of submissions, long papers, short papers and abstracts, all following the ACL2021 style, and the ACL submission policy: https://www.aclweb.org/adminwiki/index.php?title=ACL_Policies_for_Submission,_Review_and_Citation Long papers may consist of up to eight (8) pages of content, plus unlimited references, short papers may consist of up to four (4) pages of content; final versions will be given one additional page of content so that reviewers' comments can be taken into account. Abstracts may consist of up to two (2) pages of content, plus unlimited references but will not be given any additional page upon acceptance. Submissions should be sent in electronic forms, using the Softconf START conference management system. The submission site will be announced on the workshop page once available. We invite original research papers from a wide range of topics, including but not limited to: 1. What important technologies and underlying sciences need to be fostered, now and in the future? 2. In each case, are there existing tasks/benchmarks that move the field in the right direction? 3. Where are there gaps? 4. For the gaps, are there initial steps that are accessible, attractive, and cost effective? 5. How large should a benchmark be? a. How much data do we need to measure significant differences? b. How much data do machines need to obtain good performance? c. How much data do babies need to learn language? Submissions are open to all, and are to be submitted anonymously. All papers will be refereed through a double-blind peer review process by at least three reviewers with final acceptance decisions made by the workshop organizers. The workshop is scheduled to last for one day either August 5th or 6th. If you have any questions, contact us at pc-...@go... Workshop organizers Kenneth Church (Baidu USA) Mark Liberman (University of Pennsylvania) Valia Kordoni (Humboldt-Universität zu Berlin) |
From: Carlos R. <car...@li...> - 2020-12-01 10:53:17
|
Joint Workshop on Multiword Expressions and Electronic Lexicons (MWE-LEX 2020) *Workshop at **COLING 2020* <http://coling2020.org/>*, December 13, 2020.* Organized and sponsored by: Special Interest Group on the Lexicon (SIGLEX <http://www.siglex.org>) of the Association for Computational Linguistics (ACL <https://www.aclweb.org/portal/>) ELEXIS <https://elex.is/> - European Lexicographic Infrastructure. This joint event is the 16th edition of the *Workshop on Multiword Expressions (**MWE* <http://multiword.sourceforge.net/PHITE.php?sitesig=CONF>*)*. *CALL FOR PARTICIPATION* We would like to inform you that the MWE-LEX 2020 program can be accessed at: http://multiword.sourceforge.net/PHITE.php?sitesig=CONF&page=CONF_02_MWE-LEX_2020___lb__COLING__rb__&subpage=CONF_20_Program We are happy to announce that Prof. Roberto Navigli will give the Keynote Speech: http://multiword.sourceforge.net/PHITE.php?sitesig=CONF&page=CONF_02_MWE-LEX_2020___lb__COLING__rb__&subpage=CONF_10_Keynote_Speaker You can access MWE-LEX 2020 home page at: http://multiword.sourceforge.net/mwelex2020/ We would like to let you know that in order to participate you should register at https://coling2020.org/pages/registration.html (if you have not done so already). Looking forward to seeing you at the workshop MWE-LEX 2020 organizers |
From: Carlos R. <car...@li...> - 2020-11-19 13:25:28
|
Dear MWE Section members, We would like to ask for 5min of your time to (re-)register as a SIGLEX (and MWE) member: https://docs.google.com/forms/d/e/1FAIpQLSfldnrynfsqwMu_xwI-c8nxajUUeALJd9INhEPcSb8zCD-GBQ/viewform?usp=sf_link Membership is free and open to anyone interested in MWE research. Registered members can vote for SIGLEX board, including the Section representative. Don't forget to tick the MWE Section box ;-) Thanks Carlos ---------- Forwarded message --------- From: Preslav Nakov <pre...@gm...> Date: Thu, Nov 19, 2020 at 6:57 AM Subject: SIGLEX website re-registration needed! (also MWE and SemEval) To: Preslav Nakov <pre...@gm...> (Apologies for the spam) Dear all, The SIGLEX community (and its MWE and SemEval sections) have migrated to a new website: https://siglex.org/ Now, we would like to kindly ask all members to re-register using the registration form: https://docs.google.com/forms/d/e/1FAIpQLSfldnrynfsqwMu_xwI-c8nxajUUeALJd9INhEPcSb8zCD-GBQ/viewform?usp=sf_link Many thanks to Steven for taking care of this! Regards, Preslav |
From: Carlos R. <car...@li...> - 2020-10-28 14:15:57
|
Dear MWE community, We have sent a proposal for the MWE workshop in 2021: http://multiword.sourceforge.net/mwe2021/ Please, fill in the following survey to support our proposal: https://docs.google.com/forms/d/e/1FAIpQLScYZ7vPmIo72mGP1-reuvuRGM835DdMeX9zSg6qx7iuwyGWPQ/formResponse MWE 2021 appears in the "Applications, including bioNLP and finance" section. Best, Carlos, on behalf of the SIGLEX-MWE Standing Committee -- Carlos RAMISCH http://pageperso.lis-lab.fr/carlos.ramisch Assistant professor at LIS/TALEP <https://www.lis-lab.fr/talep/> and Aix Marseille University, France |
From: Julia B. G. <jb...@un...> - 2020-10-28 06:15:18
|
Apologies for cross-posting ====== Final Call for Papers: Special Issue on *Latest Advancements in Linguistic Linked Data* http://www.semantic-web-journal.net/blog/call-papers-special-issue-latest-advancements-linguistic-linked-data Contact email: swj...@go... *Deadline: 25th of January, 2021 (extended) * ====== In recent years, various efforts have arisen with regard to the representation and publication of linguistic resources such as lexicons, dictionaries, corpora, terminologies and linguistic ontologies. These efforts have exploited Semantic Web technologies and the Linguistic Linked Data (LLD) publication paradigm to facilitate and enhance the discovery, interoperability, integration and reusability of language resources. Initiatives such as the H2020 projects ELEXIS and Prêt-à-LLOD and the COST Action NexusLinguarum aim at developing robust ecosystems and networks of experts to address the LLD lifecycle, from identifying the requirements concerning the representation of linguistic resources to their exploitation by natural language processing (NLP) applications. With the rapid growth of the Linguistic Linked Open Data (LLOD) cloud and the increasing interest in the use of linked data for NLP, new challenges emerge concerning particular use cases and domain applications, language-specific features and quality dimensions, the evolution of LLD resources throughout time and the leverage of linguistic resources along LD technologies in NLP research, among other diverse aspects. This special issue on the latest advancements in LLD invites high-quality contributions, supported by a robust evaluation, which present an advancement in the state-of-the-art in the field of LLD methodologies and technologies and their use for NLP and provide insights into the new challenges ahead. 
The list of topics includes, but is not limited to, the following: - Knowledge Representation for Linguistic Data - Ontologies, vocabularies and linguistic category registries for linguistic data - Representation languages for linguistic data as LLD - Modelling challenges with state-of-the-art LLD models (e.g. OntoLex-Lemon) - Use case-based representation requirements for LLD - Ontology engineering for linguistic data representation: building, evaluation, evolution, alignment and reuse of ontologies for computational linguistics and NLP - LLD Generation and Evolution - Methodologies and workflows for LLD generation - Diachronic and sociolinguistic approaches to LLD generation and evolution - Innovative approaches to automatic LLD generation - Technically robust and systematically evaluated LLD resources - LLD for under-resourced and underrepresented languages and domains - Linking LLD sets across multiple dimensions and levels of linguistic description - LLD quality evaluation and resource curation - LLD extension, enrichment and evolution - LLD Publication, Querying and Visualization - Publication and metadata - IPR, licensing and privacy issues - LLD specific query techniques and languages - Supporting interfaces for different steps of the LLD lifecycle - Visualization of LLD - LLD and NLP research - LLD for NLP and NLP for LLD - Integration, exploitation and added value of LLD technologies and interoperable linguistic resources in NLP systems - LLD in Deep Learning-based NLP approaches - LLD in Big Data contexts - Applications and Use Cases - Automatic approaches for different steps of the LLD lifecycle - Knowledge extraction and representation from linguistic resources - LLD for research in specific domains (e.g. linguistics, digital humanities, life sciences, law, journalism, etc.) - LLD specific features and requirements from domain experts Deadline Submission deadline: 25th of January, 2021 (extended from 20th of November, 2020).
Papers submitted before the deadline will be reviewed upon receipt. Guest editors The guest editors can be reached at swj...@go... Julia Bosque-Gil, University of Zaragoza, Spain Milan Dojchinovski, Czech Technical University in Prague, Czech Republic Marieke van Erp, KNAW Humanities Cluster, Amsterdam, Netherlands Christian Chiarcos, Goethe Universität Frankfurt, Germany Philipp Cimiano, Bielefeld University, Germany -- Julia Bosque-Gil Aragon Institute of Engineering Research (I3A) University of Zaragoza Pronouns: she/her |
From: Stella M. <sti...@gm...> - 2020-10-06 19:00:31
|
The MWE-LEX 2020 workshop is inviting authors of accepted papers at "Findings of EMNLP" to present their work at our workshop. To submit your paper for a presentation slot, please send an email to mwe...@gm... by Friday October 9, 2020 with: * Your paper * One or two sentences explaining why it would be a good fit for the scope of MWE-LEX 2020 The MWE-LEX 2020 Organizing Committee |
From: Julia B. G. <jb...@un...> - 2020-09-23 12:04:45
|
Apologies for cross-posting ====== 2nd Call for Papers: Special Issue on *Latest Advancements in Linguistic Linked Data* http://www.semantic-web-journal.net/blog/call-papers-special-issue-latest-advancements-linguistic-linked-data Contact email: swj...@go... *Deadline: 20th of November, 2020 * ====== In recent years, various efforts have arisen with regard to the representation and publication of linguistic resources such as lexicons, dictionaries, corpora, terminologies and linguistic ontologies. These efforts have exploited Semantic Web technologies and the Linguistic Linked Data (LLD) publication paradigm to facilitate and enhance the discovery, interoperability, integration and reusability of language resources. Initiatives such as the H2020 projects ELEXIS and Prêt-à-LLOD and the COST Action NexusLinguarum aim at developing robust ecosystems and networks of experts to address the LLD lifecycle, from identifying the requirements concerning the representation of linguistic resources to their exploitation by natural language processing (NLP) applications. With the rapid growth of the Linguistic Linked Open Data (LLOD) cloud and the increasing interest in the use of linked data for NLP, new challenges emerge concerning particular use cases and domain applications, language-specific features and quality dimensions, the evolution of LLD resources throughout time and the leverage of linguistic resources along LD technologies in NLP research, among other diverse aspects. This special issue on the latest advancements in LLD invites high-quality contributions, supported by a robust evaluation, which present an advancement in the state-of-the-art in the field of LLD methodologies and technologies and their use for NLP and provide insights into the new challenges ahead. 
The list of topics includes, but is not limited to, the following: - Knowledge Representation for Linguistic Data - Ontologies, vocabularies and linguistic category registries for linguistic data - Representation languages for linguistic data as LLD - Modelling challenges with state-of-the-art LLD models (e.g. OntoLex-Lemon) - Use case-based representation requirements for LLD - Ontology engineering for linguistic data representation: building, evaluation, evolution, alignment and reuse of ontologies for computational linguistics and NLP - LLD Generation and Evolution - Methodologies and workflows for LLD generation - Diachronic and sociolinguistic approaches to LLD generation and evolution - Innovative approaches to automatic LLD generation - Technically robust and systematically evaluated LLD resources - LLD for under-resourced and underrepresented languages and domains - Linking LLD sets across multiple dimensions and levels of linguistic description - LLD quality evaluation and resource curation - LLD extension, enrichment and evolution - LLD Publication, Querying and Visualization - Publication and metadata - IPR, licensing and privacy issues - LLD specific query techniques and languages - Supporting interfaces for different steps of the LLD lifecycle - Visualization of LLD - LLD and NLP research - LLD for NLP and NLP for LLD - Integration, exploitation and added value of LLD technologies and interoperable linguistic resources in NLP systems - LLD in Deep Learning-based NLP approaches - LLD in Big Data contexts - Applications and Use Cases - Automatic approaches for different steps of the LLD lifecycle - Knowledge extraction and representation from linguistic resources - LLD for research in specific domains (e.g. linguistics, digital humanities, life sciences, law, journalism, etc.) - LLD specific features and requirements from domain experts Deadline Submission deadline: 20th of November, 2020.
Papers submitted before the deadline will be reviewed upon receipt. Guest editors The guest editors can be reached at swj...@go... . Julia Bosque-Gil, University of Zaragoza, Spain Milan Dojchinovski, Czech Technical University in Prague, Czech Republic Marieke van Erp, KNAW Humanities Cluster, Amsterdam, Netherlands Christian Chiarcos, Goethe Universität Frankfurt, Germany Philipp Cimiano, Bielefeld University, Germany -- Julia Bosque-Gil Aragon Institute of Engineering Research (I3A) University of Zaragoza Pronouns: she/her |
From: Carlos R. <car...@li...> - 2020-09-02 18:11:52
|
Dear all, We remind you that the deadline to apply for the SIGLEX-MWE section's Standing Committee <http://multiword.sourceforge.net/PHITE.php?sitesig=MWE#sc> is this *Friday, September 4.* If you have any questions about the committee's functions, do not hesitate to contact me and/or the current SC members. Any member of the MWE-SIGLEX section can send an expression of interest via the online form <https://forms.gle/nHj8UrvwNRUSiezK9>. We are looking forward to your submissions and to working together for the MWE community! Carlos Ramisch SIGLEX-MWE section representative on behalf of the Standing Committee On Fri, Jul 31, 2020 at 1:00 PM Carlos Ramisch <car...@li...> wrote: > Dear SIGLEX-MWE Section members, > > This is a call for officers of the SIGLEX-MWE Section Standing Committee > <http://multiword.sourceforge.net/PHITE.php?sitesig=MWE#sc> (SC). > > According to the Section's constitution > <http://multiword.sourceforge.net/MWE_MAIN/SIGLEX-MWE-section-constitution-2017-08-23.pdf>, > the SC consists of one elected representative and 4 nominated officers. > The nominated officers are selected by the SIGLEX <http://www.siglex.org/> > board from a list proposed by the Section representative. > The duration of the term of an SC nominated officer is *2 years*. > The SC officers must be members of the Section (and of SIGLEX) and have > published research work in topics related to multiword expressions. > The duties of the SC are defined by the constitution. > > This year, 2 officers are stepping down, and 2 new officers will be > nominated. > If you are interested in becoming one of them, and influencing future > developments of the MWE community, please, submit your expression of > interest via the web form <https://forms.gle/nHj8UrvwNRUSiezK9> until *September > 4, 2020*. > > We are looking forward to your submissions and to working together for the > MWE community! > > Carlos Ramisch > SIGLEX-MWE section representative > on behalf of the Standing Committee > |
From: Carlos R. <car...@li...> - 2020-07-31 11:02:21
|
Dear SIGLEX-MWE Section members, This is a call for officers of the SIGLEX-MWE Section Standing Committee <http://multiword.sourceforge.net/PHITE.php?sitesig=MWE#sc> (SC). According to the Section's constitution <http://multiword.sourceforge.net/MWE_MAIN/SIGLEX-MWE-section-constitution-2017-08-23.pdf>, the SC consists of one elected representative and 4 nominated officers. The nominated officers are selected by the SIGLEX <http://www.siglex.org/> board from a list proposed by the Section representative. The duration of the term of an SC nominated officer is *2 years*. The SC officers must be members of the Section (and of SIGLEX) and have published research work in topics related to multiword expressions. The duties of the SC are defined by the constitution. This year, 2 officers are stepping down, and 2 new officers will be nominated. If you are interested in becoming one of them, and influencing future developments of the MWE community, please, submit your expression of interest via the web form <https://forms.gle/nHj8UrvwNRUSiezK9> until *September 4, 2020*. We are looking forward to your submissions and to working together for the MWE community! Carlos Ramisch SIGLEX-MWE section representative on behalf of the Standing Committee |
From: Carlos R. <car...@li...> - 2020-07-01 07:57:27
|
Dear PARSEMErs, *The evaluation phase of the PARSEME shared task 1.2 on semi-supervised identification of verbal MWEs has just started!* We have released the blind test data for all 14 languages on our public Gitlab repo: https://gitlab.com/parseme/sharedtask-data/ You can also use the larger unannotated corpora available here (also in closed track): https://gitlab.com/parseme/corpora/-/wikis/Raw-corpora-for-the-PARSEME-1.2-shared-task This year's focus is on unseen VMWEs: the general ranking will emphasize results on unseen VMWEs. The *deadline* for the submission of results was *extended to July 6* (anywhere in the world). Results submission is to be made on the MWE-LEX softconf page: https://www.softconf.com/coling2020/MWE-LEX/ Results must be a single compressed archive ("*zip*") with one folder per language, named according to the 2-letter language code (e.g. GA/ for Irish). Each output must be named *test.system.cupt* and conform to the *.cupt* format <http://multiword.sourceforge.net/cupt-format>. Before submitting, please, download the format validation script <https://gitlab.com/parseme/sharedtask-data/blob/master/1.2/bin/validate_cupt.py> and check the format as follows: ./validate_cupt.py --input test.system.cupt If you participate in both the closed and open tracks, please make distinct submissions for each. Each team can submit 2 results per track, i.e. at most 4 in total (with one result per language in each submission). It is not mandatory to cover all languages, but then the macro-averages will not be comparable to other systems. Subscribe and use the participants' mailing list if you find a bug or if you have questions: ver...@go... To reach the organizers, you can write to Par...@nl... Best Agata, Ashwini, Bruno, Carlos, Jakub, Marie |
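The archive layout described in the message above (one folder per 2-letter language code, each holding a file named test.system.cupt) can be sketched in Python as follows. This is only an illustration, not an official PARSEME tool: the function name build_submission and its arguments are hypothetical, and each output should still be checked with validate_cupt.py before packaging.

```python
import zipfile
from pathlib import Path


def build_submission(predictions, out_zip):
    """Pack system outputs into a zip with one <LANG>/test.system.cupt per language.

    predictions maps a 2-letter language code (e.g. "GA") to the path of that
    language's system output. Hypothetical helper, not part of the shared task.
    """
    entries = []
    for lang, cupt_path in sorted(predictions.items()):
        code = lang.upper()
        # Reject anything that is not a plain 2-letter code before writing.
        if len(code) != 2 or not (code.isascii() and code.isalpha()):
            raise ValueError(f"not a 2-letter language code: {lang!r}")
        entries.append((code, Path(cupt_path)))

    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as archive:
        for code, cupt_path in entries:
            # The folder is the language code; the file name is fixed.
            archive.write(cupt_path, arcname=f"{code}/test.system.cupt")
    return Path(out_zip)
```

Calling build_submission({"GA": "irish_output.cupt"}, "closed_track.zip") would produce an archive containing GA/test.system.cupt; a separate archive would be built for each track and each submission.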
From: Agata S. <aga...@un...> - 2020-06-19 11:18:32
|
* PARSEME shared task 1.2 on semi-supervised identification of verbal multiword expressions http://multiword.sourceforge.net/sharedtask2020 Final call for participation (Apologies for cross-posting) The third edition of the PARSEME shared task on automatic identification of verbal multiword expressions (VMWEs) aims at identifying **verbal MWEs** in running texts. Verbal MWEs include, among others, verbal idioms (to let the cat out of the bag), light-verb constructions (to make a decision), verb-particle constructions (to give up), multi-verb constructions (to make do) and inherently reflexive verbs (s'évanouir 'to faint' in French). Their identification is a well-known challenge for NLP applications, due to their complex characteristics including discontinuity, overlaps, non-compositionality, heterogeneity and syntactic variability. Editions 1.0 <http://multiword.sourceforge.net/sharedtask2017/> (2017) and 1.1 <http://multiword.sourceforge.net/sharedtask2018/> (2018) have shown that, while some systems reach high performance (F1>0.7) for identifying VMWEs that were seen in the training corpus, performance on unseen VMWEs is very low (F1<0.2). Hence for this third edition, **emphasis will be put on discovering VMWEs that were not seen in the training corpus**.
We kindly ask potential participant teams to register using the expression of interest form: https://docs.google.com/forms/d/e/1FAIpQLSfcmbd6MmKjFuBxCoaTWGCPGqoH5FoJ-th8IAZk3kh_ECDaZQ/viewform?usp=sf_link Task updates and questions will be posted on the shared task website: http://multiword.sourceforge.net/sharedtask2020 and announced on our public mailing list (anyone can join): http://groups.google.com/group/verbalmwe #### Publication and workshop Shared task participants will be invited to submit a system description paper to a special track of the Joint Workshop on Multiword Expressions and Electronic Lexicons (MWE-LEX 2020), at COLING 2020, to be held on December 13, 2020, in Barcelona, Spain (postponed): http://multiword.sourceforge.net/mwelex2020 Submitted system description papers must follow the workshop submission instructions and will go through double-blind peer reviewing. Their acceptance depends on the quality of the paper rather than on the ranking in the shared task. Authors of the accepted papers will present their work as posters/demos in a dedicated session of the MWE-LEX 2020 workshop. The submission of a system description paper is not mandatory. Due to double blind review, participants are asked to provide a nickname (i.e. a name that does not identify authors, universities, research groups etc.) for their systems when submitting results and system description papers. #### Provided corpora The PARSEME team has prepared corpora in which VMWEs were manually annotated: https://gitlab.com/parseme/corpora/wikis/home. The provided annotations follow the PARSEME 1.2 guidelines: https://parsemefr.lis-lab.fr/parseme-st-guidelines/1.2/. 
On March 23, 2020, we released, for each language: * a training corpus manually annotated for VMWEs; * a development corpus to tune/optimize the systems' parameters; and * a syntactically parsed raw corpus, not annotated for VMWEs, to support semi- and unsupervised methods for VMWE discovery (for each language, the size is between 12 million tokens and 2.5 billion tokens) On July 1, 2020, we will release, for each language: * A blind test corpus to be used as input to the systems during the evaluation phase, during which the VMWE annotations will be kept secret. On July 3, 2020, participants will have to upload their annotated version of the test corpus at https://www.softconf.com/coling2020/MWE-LEX/ Morphosyntactic annotations (parts of speech, lemmas, morphological features, and syntactic dependencies) are also provided, both for annotated and raw corpora. Depending on the language, the information comes from treebanks (mostly Universal Dependencies v2) or from automatic parsers trained on UD v2 treebanks (e.g., UDPipe). The annotated training and development corpora are released in the CUPT format <http://multiword.sourceforge.net/cupt-format/> (which is the CoNLL-U format with an extra column for the MWE annotations). The raw corpora are released in the CoNLL-U format <https://universaldependencies.org/format>. The blind test corpus will be released in the CUPT format, with an underspecified 11th column to be predicted. Reference annotations for the test corpus will be released after the evaluation phase.
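As a rough illustration of the 11-column layout mentioned above, the Python sketch below collects the MWE annotations of one sentence. It rests on assumptions that are not stated in this message and should be checked against the CUPT format page: that "*" marks a token outside any VMWE, "_" marks an underspecified value, the first token of a VMWE carries "number:category", later tokens carry the bare number, and several codes on one token are separated by ";". The function name read_mwes is hypothetical.

```python
def read_mwes(sentence_lines):
    """Group token IDs by MWE number using the 11th column of CUPT token lines.

    Assumed encoding (verify against the CUPT documentation): "*" = no MWE,
    "_" = underspecified, "1:VID" = first token of MWE 1 with its category,
    "1" = further token of MWE 1, ";" = separator between several codes.
    """
    mwes = {}
    for line in sentence_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/metadata lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 11:
            raise ValueError(f"expected 11 tab-separated columns, got {len(cols)}")
        token_id, tag = cols[0], cols[10]
        if tag in ("*", "_"):
            continue  # token is outside any MWE, or not yet predicted
        for code in tag.split(";"):
            number, _, category = code.partition(":")
            entry = mwes.setdefault(int(number), {"category": None, "tokens": []})
            if category:
                entry["category"] = category
            entry["tokens"].append(token_id)
    return mwes
```

Token IDs are kept as strings because CoNLL-U allows range IDs for multiword tokens; a real system would also need to decide how to treat those lines.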
The trial data, training and dev sets are available on the shared task's release repository: https://gitlab.com/parseme/sharedtask-data/tree/master/1.2 The raw corpus is available on the corpus initiative website: https://gitlab.com/parseme/corpora/wikis/Raw-corpora-for-the-PARSEME-1.2-shared-task Corpora are available for the following languages: German (DE), Greek (EL), Basque (EU), French (FR), Irish (GA), Hebrew (HE), Hindi (HI), Italian (IT), Polish (PL), Brazilian Portuguese (PT), Romanian (RO), Swedish (SV), Turkish (TR), Chinese (ZH). The amount of annotated data in the training, development, test, and raw corpus depends on the language. #### Tracks System results can be submitted in two tracks: * Closed track: Systems using only the provided training and development corpora (with VMWE and morpho-syntactic annotations) + provided raw corpora. * Open track: Systems using or not the provided training corpus, plus any additional resources deemed useful (MWE lexicons, symbolic grammars, wordnets, other raw corpora, word embeddings and language models trained on external data, etc.). This track includes notably purely symbolic and rule-based systems. In both tracks, the use of the corpora from the previous PARSEME shared tasks, and from the PARSEME source repositories <https://gitlab.com/parseme/corpora/-/wikis/home#active-languages>, is strictly forbidden, as material may have moved during corpus splits. Teams submitting systems in the open track will be requested to describe and provide references to all resources used at submission time. Teams are encouraged to favor freely available resources for better reproducibility of their results. #### Evaluation metrics Participants will provide the output produced by their systems on the test corpus in the CUPT format, with the 11th column containing their predictions. This output will be compared with the gold standard (ground truth) using both generic and specialised precision, recall and F1 scores. 
The evaluation metrics will be the same as for the 1.1 edition, as described in: http://multiword.sourceforge.net/PHITE.php?sitesig=CONF&page=CONF_04_LAW-MWE-CxG_2018___lb__COLING__rb__&subpage=CONF_50_Evaluation_metrics Note that for the 1.2 edition the published general ranking will emphasize 3 metrics: * global MWE-based * global Token-based * unseen MWE-based A VMWE from the test corpus is considered seen if a VMWE with the same (multi-)set of lemmas is annotated at least once in the training or development corpus. #### Corpus split For each language, the annotated sentences are shuffled and split, in a way which ensures that there is a minimum of 300 VMWEs in the test set which are unseen in the training + dev sets. This means that the natural sequence of sentences in a document will not be respected in the proposed corpus split. Note the unseen ratio, that is, the proportion of unseen VMWEs wrt all VMWEs in the test set, may vary across languages. To guide participants on this hard task, the number and rate of unseen VMWEs for the dev corpora are available on the shared task website. In both tracks, the use of previous shared task editions' corpora, and from the PARSEME source repositories <https://gitlab.com/parseme/corpora/-/wikis/home#active-languages>, is strictly forbidden, as material may have moved during corpus splits. 
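The definition of a seen VMWE given above (same multiset of lemmas annotated at least once in training or development data) can be sketched in Python as follows. This is a minimal illustration, not the official evaluation script: the function names are hypothetical, each VMWE is assumed to be already reduced to a list of lemma strings, and any lemma normalisation (e.g. case) applied by the real scorer is not reproduced here.

```python
def lemma_key(lemmas):
    """Order-insensitive key that keeps repeated lemmas, i.e. a multiset."""
    return tuple(sorted(lemmas))


def split_seen_unseen(train_dev_vmwes, test_vmwes):
    """Split test VMWEs into (seen, unseen) relative to train + dev VMWEs.

    Both arguments are lists of VMWEs, each VMWE being a list of lemmas.
    """
    seen_keys = {lemma_key(vmwe) for vmwe in train_dev_vmwes}
    seen = [v for v in test_vmwes if lemma_key(v) in seen_keys]
    unseen = [v for v in test_vmwes if lemma_key(v) not in seen_keys]
    return seen, unseen


def unseen_ratio(train_dev_vmwes, test_vmwes):
    """Proportion of unseen VMWEs among all VMWEs in the test set."""
    _, unseen = split_seen_unseen(train_dev_vmwes, test_vmwes)
    return len(unseen) / len(test_vmwes) if test_vmwes else 0.0
```

Sorting the lemmas rather than converting them to a set keeps duplicates, so a VMWE with a repeated lemma is not conflated with one where that lemma occurs once.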
#### Important dates (updated) * Feb 19, 2020: trial data and evaluation script released * Mar 23, 2020: training and development corpus + raw corpus released * Jul 01, 2020: blind test corpus released * Jul 03, 2020: submission of system results * Jul 09, 2020: announcement of results * Sep 02, 2020: shared task system description papers due (same as regular papers) * Oct 16, 2020: notification of acceptance * Nov 01, 2020: camera-ready system description papers due * Dec 13, 2020: shared task session at the MWE-LEX 2020 <http://multiword.sourceforge.net/mwelex2020> workshop at Coling 2020 #### Organizing team Carlos Ramisch, Marie Candito, Bruno Guillaume, Agata Savary, Ashwini Vaidya, and Jakub Waszczuk Contact: par...@nl... <mailto:par...@nl...>* |
From: John P. M. <jo...@mc...> - 2020-06-13 07:19:53
|
Apologies for cross-posting ------ Due to the global situation and cancellation of the LREC 2020 conference, our workshop, Linked Data in Linguistics will take place virtually on June 22nd and 23rd. The program and details about participation can be found here: http://ldl2020.linguistic-lod.org/program.html soon. Meanwhile, the proceedings are already published on the LREC website: https://lrec2020.lrec-conf.org/en/workshops-and-tutorials/2020-workshops/ In order to participate in the workshop, please register in advance, a link to a registration form can be found here: https://forms.gle/cK8TqpiqDBEWQRk7A Information about the workshop: Since its establishment in 2012, the Linked Data in Linguistics (LDL) workshop series has become the major forum for presenting, discussing and disseminating technologies, vocabularies, resources and experiences regarding the application of semantic technologies and the Linked Open Data (LOD) paradigm to language resources in order to facilitate their visibility, accessibility, interoperability, reusability, enrichment, combined evaluation and integration. The LDL workshop series is organized by the Open Linguistics Working Group of the Open Knowledge Foundation and has contributed greatly to the emergence and growth of the Linguistic Linked Open Data (LLOD) cloud. LDL workshops contribute to the discussion, dissemination and establishment of community standards that drive this development, most notably the OntoLex-lemon model for lexical resources, as well as standards for other types of language resources still under development. Past years have seen a growing interest in the application of knowledge graphs and Semantic Web technologies to language resources, and their publication as linked data on the Web. As of today, a large number of language resources were either converted or created natively as linked data on the basis of data models specifically designed for the representation of linguistic content. 
Examples are wordnets, dictionaries, corpora — research papers describing the creation of these resources were presented at the previous editions of both LREC and LDL. At the same time, the growth of the LLOD cloud is far from over: new use-cases call for new data models and new resources to be created or converted. However, even though a critical mass of LLOD is already in place, there is still a pressing need for a robust ecosystem of tools that consume linguistic linked data. Recently started research networks and European projects, such as NexusLinguarum, ELEXIS, and Prêt-à-LLOD are working in the direction of building sustainable infrastructures around LRs, with linked data as one of the core technologies. By collocating the 7th edition of the workshop series with LREC, we encourage this interdisciplinary community to participate in the dialogue on these issues, to present and to discuss use cases, experiences, best practices, recommendations and technologies among each other and in interaction with the language resource community. The LDL workshop series has a general focus on LOD-based resources, vocabularies, infrastructures and technologies as means for managing, improving and using language resources on the Web. As technology and resources increasingly converge towards a LOD-based ecosystem, we particularly encourage submissions on Linked-Data Aware Tools and Services and Linked Language Resources Infrastructure, i.e. managing, curating and applying LLOD technologies and resources in a reliable and reproducible way for the needs of linguistics, NLP and digital humanities. |
From: Ekaterina S. <ka...@ic...> - 2020-04-13 18:32:26
|
FINAL CALL FOR PAPERS: Note the *new submission deadline* due to the COVID-19 situation ACL 2020 Workshop on Figurative Language Processing July 9, 2020 https://sites.google.com/view/figlang2020/ Submission deadline: April 23, 2020 WORKSHOP DESCRIPTION Figurative language processing is a rapidly growing area in NLP, including processing of metaphors, idioms, puns, irony, sarcasm, as well as other figures. Characteristic to all areas of human activity (from poetic to ordinary to scientific) and, thus, to all types of discourse, figurative language becomes an important problem for NLP systems. Its ubiquity in language has been established in a number of corpus studies and the role it plays in human reasoning has been confirmed in psychological experiments. This makes figurative language an important research area for computational and cognitive linguistics, and its automatic identification and interpretation indispensable for any semantics-oriented NLP application. The main focus of the workshop will be on computational modelling of figurative language using state-of-the-art NLP techniques. However, papers on cognitive, linguistic, social, rhetorical, and applied aspects are also of interest, provided that they are presented within a computational, a formal, or a quantitative framework. In addition, we will also conduct two shared tasks on metaphor and sarcasm detection. The workshop invites both full papers and short papers for either oral or poster presentation. 
Submission site: https://www.softconf.com/acl2020/flp/ IMPORTANT DATES April 23, 2020 Paper submissions due (23:59 West Coast USA time) May 15, 2020 Notification of acceptance May 25, 2020 Camera-ready papers due July 9, 2020 Workshop (taking place virtually, alongside ACL 2020) WORKSHOP CO-CHAIRS Beata Beigman Klebanov, Educational Testing Service, USA Ekaterina Shutova, University of Amsterdam, The Netherlands Smaranda Muresan, Columbia University, USA Patricia Lichtenstein, University of California, Merced, USA Ben Leong, Educational Testing Service, USA Anna Feldman, Montclair State University, USA Debanjan Ghosh, Educational Testing Service, USA |
From: <jm...@un...> - 2020-04-08 17:07:44
|
*** Apologies for cross postings *** ------------------------------------------------------------------------- CLiC-it 2020 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Seventh Italian Conference on Computational Linguistics ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ November 30th – December 2nd, 2020 Bologna Conference Announcement and First Call for Papers http://clic2020.ilc.cnr.it --------- The Italian Conference on Computational Linguistics, CLiC-it, aims at establishing a reference forum for the Italian community of researchers working in the fields of Computational Linguistics (CL) and Natural Language Processing (NLP). CLiC-it promotes and disseminates high-level, original research on all aspects of automatic language processing, both written and spoken, and targets state-of-the-art theoretical results, experimental methodologies, technologies, as well as application perspectives, which may contribute to the advancement of the CL and NLP fields. The spirit of the conference is inclusive. In the conviction that the complexity of language phenomena needs cross-disciplinary competences, CLiC-it intends to bring together researchers of related disciplines such as Computational Linguistics, Natural Language Processing, Linguistics, Cognitive Science, Machine Learning, Computer Science, Knowledge Representation, Information Retrieval, and Digital Humanities. CLiC-it is open to contributions on all languages, with a particular emphasis on Italian. The seventh edition of CLiC-it will be held in Bologna, on November 30th – December 2nd, 2020. The conference will be followed by EVALITA 2020 (http://www.evalita.it/2020), the 7th evaluation campaign of Natural Language Processing and speech tools for the Italian language. Both CLiC-it and EVALITA are initiatives of the Italian Association of Computational Linguistics (AILC — http://www.ai-lc.it). 
We know that, due to the COVID-19 pandemic, the situation is uncertain, but we have to think positively and think about rebuilding and reinforcing the community. We therefore believe that we can all see each other in December in Bologna and enjoy the event; if that eventually proves impossible, we will set up a technical solution to hold it online and publish the proceedings. This important moment for exchanging thoughts, work, solutions and affection will be preserved in some way. Requirements --------- The conference invites the submission of papers on all aspects of automated language processing. Relevant topics for the conference include, but are not limited to, the following areas: - Dialogue, Discourse and Natural Language Generation - Information Extraction, Information Retrieval and Question Answering - Language Resources and Evaluation - Language and Cognition - Linguistic Issues in CL and NLP - Machine Learning for NLP - Machine Translation and Multilinguality - Morphology and Syntax Processing - NLP for Digital Humanities - NLP for Web and Social Media - Pragmatics and Creativity - Research and Industrial NLP Applications - Semantics and Knowledge Representation - Spoken Language Processing and Automatic Speech Understanding - Vision, Robotics, Multimodal and Grounding CLiC-it 2020 has the goal of a broad technical program. We invite papers in theoretical computational linguistics, empirical/data-driven approaches, resources and their evaluation, as well as NLP applications and tools. We also invite papers describing a challenge in the field, position papers, survey papers, and papers that describe a negative result. We are also favouring a parallel submission policy for outstanding papers that have been submitted and accepted elsewhere in 2020. If you are the author of a paper accepted at a major international CL conference or journal in 2020, you can present your work at CLiC-it 2020 in the form of a short research communication, within a dedicated session at the conference.
Research communications will not be published in the proceedings, but are mostly intended to enforce dissemination of excellence in research within the Italian CL community. Submission Format ————————– Papers may consist of up to four (4) pages of content, and two (2) additional pages of references. Papers can be either in English or Italian, with the abstract both in English and Italian. Accepted papers will be published on-line and will be presented at the conference either orally or as a poster. For research communications (see above) a two (2) page abstract is required. The deadline for all types of submissions is July 15, 2020. Submissions should follow the ACL two-column format. We strongly recommend the use of LaTeX style files or Microsoft Word style files according to the ACL format, which will be available on the conference website under “Information for Authors”. Submission must be electronic in PDF, using the Easychair submission software. Reviewing will NOT be blind, so there is no need to remove author information from manuscripts. Important Dates --------- 15/07/2020: Paper submission deadline 23/09/2020: Notification to authors of reviewing outcome 15/10/2020: Camera-ready version of accepted papers 30/11 - 2/12/2020: CLiC-it Conference, Bologna People --------- Program co-chairs: Felice Dell’Orletta (Istituto di Linguistica Computazionale “A.Zampolli” – CNR) Johanna Monti (Università di Napoli “L’Orientale”) Fabio Tamburini (Università di Bologna) Further information --------- Conference website: http://clic2020.ilc.cnr.it/ Mail: cli...@gm... |
From: Agata S. <aga...@gm...> - 2020-03-31 14:33:21
|
* The role of constituents in multiword expressions: An interdisciplinary, cross-lingual perspective Sabine Schulte im Walde and Eva Smolka (eds.) Volume 4 of Phraseology and Multiword Expressions (PMWE), a book series at Language Science Press (LSP) Book URL: http://langsci-press.org/catalog/book/239 Electronic ISBN: 978-3-96110-184-9 Pages: 209 Price: Europe EURO 0 Comment: Open Access Synopsis Multiword expressions (MWEs), including noun compounds (such as nickname in English and Ohrwurm in German), complex verbs (such as give up in English and aufgeben in German) and idioms (such as break the ice in English and das Eis brechen in German), may be interpreted literally but often undergo meaning shifts with respect to their constituents. Theoretical, psycholinguistic as well as computational linguistic research remain puzzled by when and how MWEs receive literal vs. meaning-shifted interpretations, what the contributions of the MWE constituents are to the degree of semantic transparency (i.e., meaning compositionality) of the MWE, and how literal vs. meaning-shifted MWEs are processed and computed. This edited volume presents an interdisciplinary selection of seven papers on recent findings across linguistic, psycholinguistic, corpus-based and computational research fields and perspectives, discussing the interaction of constituent properties and MWE meanings, and how the constituents contribute to the processing and representation of MWEs. The collection is based on a workshop at the 2017 annual conference of the German Linguistic Society (DGfS) that took place at Saarland University in Saarbrücken, Germany.
Available material Language Science Press, as a fully Open Access publisher, provides the following on-line material with the volume: * pdf files of each chapter and of the whole book <https://langsci-press.org/catalog/view/239/1886/1761-1> * LaTeX source code <https://github.com/langsci/239> of the whole volume (on GitHub) * The bibliography <https://langsci-press.org/catalog/download/239/1887/1760-1> of the whole volume in .bib format All this comes to the readers for free! Chapters * Constituents in multiword expressions: What is their role, and why do we care? <https://langsci-press.org/catalog/view/239/1888/1762-1> Sabine Schulte im Walde & Eva Smolka * Aiming with → arrows ← at particles: Towards a conceptual analysis of directional meaning components in German particle verbs <https://langsci-press.org/catalog/view/239/1889/1763-1> Sylvia Springorum & Sabine Schulte im Walde * Do semantic features capture a syntactic classification of compounds? Insights from compositional distributional semantics <https://langsci-press.org/catalog/view/239/1890/1764-1> Sandro Pezzelle & Marco Marelli * Compositionality in English deverbal compounds: The role of the head <https://langsci-press.org/catalog/view/239/1891/1765-1> Gianina Iordăchioaia, Lonneke van der Plas & Glorianna Jagfeld * What can we learn from novel compounds? <https://langsci-press.org/catalog/view/239/1892/1766-1> Gary Libben * Internal constituent variability and semantic transparency in N Prep N constructions in Romance languages <https://langsci-press.org/catalog/view/239/1893/1767-1> Inga Hennecke * Production of multiword referential phrases: Inclusion of over-specifying information and a preference for modifier-noun phrases <https://langsci-press.org/catalog/view/239/1894/1768-1> Christina L. Gagné, Thomas L. Spalding, J. Claire Burry & Jessica Tellis Adams * Can you reach for the planets or grasp at the stars?
– Modified noun, verb, or preposition constituents in idiom processing <https://langsci-press.org/catalog/view/239/1895/1769-1> Eva Smolka & Carsten Eulitz |
From: Agata S. <aga...@un...> - 2020-03-26 09:47:49
|
The Universities of Tours and Orléans in France offer a PhD position in computational linguistics: *Design and automatic induction of a multiword expression lexicon at the service of linguistic diversity*. Application deadline: 14 May 2020 (or until filled). More details can be found at: http://www.info.univ-tours.fr/ICVL/doc/jobs/2020-PhD-topic-MWE-lexicon-induction.pdf -- Agata Savary Associate Professor University of Tours 3 place Jean-Jaurès, 41029 Blois, France phone: +33 (0)2 54 55 21 47 aga...@un... http://www.info.univ-tours.fr/~savary/ PMWE book series: https://langsci-press.org/catalog/series/pmwe ICVL federation: http://www.info.univ-tours.fr/ICVL |
From: Carlos R. <car...@li...> - 2020-03-24 00:20:24
|
Dear all, We are happy to announce the release of the **training and development data** for the PARSEME shared task 1.2 <http://multiword.sourceforge.net/PHITE.php?sitesig=CONF&page=CONF_02_MWE-LEX_2020___lb__COLING__rb__&subpage=CONF_40_Shared_Task> on semi-supervised identification of verbal multiword expressions (VMWEs): https://gitlab.com/parseme/sharedtask-data/tree/master/1.2 ### Languages We provide full training and development sets for 14 languages: German (DE), Greek (EL), Basque (EU), French (FR), Irish (GA), Hebrew (HE), Hindi (HI), Italian (IT), Polish (PL), Brazilian Portuguese (PT), Romanian (RO), Swedish (SV), Turkish (TR) and Chinese (ZH). ### Annotated data We provide .cupt files <http://multiword.sourceforge.net/cupt-format> that contain VMWE annotations and morphosyntactic data. The annotation guidelines for VMWEs were slightly extended with respect to previous editions to accommodate Chinese and Swedish phenomena, and to fix minor issues in Hindi-specific tests, leading to PARSEME guidelines 1.2 <https://parsemefr.lis-lab.fr/parseme-st-guidelines/1.2/>. The accompanying morphosyntactic information (POS tags, lemmas, morphological features and/or syntactic dependencies) uses the UD v2 scheme <http://universaldependencies.org/> (the exact version of the UD-based data depends on the language). Depending on the language, the morphosyntactic information was manually or automatically annotated. All annotations are available under open licenses, notably various flavors of the Creative Commons license. We remind you that the blind test data will be released on April 28, and the submission of system results is due on April 30. ### Additional raw corpora We also provide "raw" corpora, meant to help identify VMWEs that were unseen at training time. Here are the instructions for downloading these raw corpora <https://gitlab.com/parseme/corpora/-/wikis/Raw-corpora-for-the-PARSEME-1.2-shared-task>. 
The raw corpora were automatically parsed with UD v2 tools (the exact version depending on the language) and are provided in the CoNLL-U <https://universaldependencies.org/format.html> format. Their sizes vary from language to language; see the raw corpora page <https://gitlab.com/parseme/corpora/-/wikis/Raw-corpora-for-the-PARSEME-1.2-shared-task> for statistics. ### Split of the annotated data We provide training (train), development (dev) and test sets for each language. The test set will be released later, after the evaluation phase is over. The data split was performed with unseen VMWE identification in mind. The split is random but we controlled the following factors for each language: - Test contains about 300 VMWEs which are unseen in train+dev - Dev contains about 100 VMWEs which are unseen in train - The ratio of unseen VMWEs in test with respect to train+dev (resp. dev with respect to train) is as close as possible to an average (see below for details) Unseen VMWEs are defined as in the evaluation script, that is, a VMWE in test (resp. dev) is considered unseen in train+dev (resp. train) if its multi-set of lemmas does not occur as an annotated VMWE, with the same multi-set of lemmas, in train+dev (resp. train). The ratios of unseen VMWEs vary from language to language. For most languages, the ratios of unseen VMWEs in test (with respect to train+dev) and in dev (with respect to train) are comparable, but this was not possible for languages with little data. To choose the final split, we first estimated the number of sentences in test (resp. dev) needed to provide 300 (resp. 100) VMWEs unseen in train+dev (resp. train). Then, we ran several random splits and selected one for which the unseen ratio is as close as possible to the average. ### Guidelines to participants During the system development phase, and for computing the results on the test sets, the participants are free to use train+dev in any way. 
In other words, the dev set can be added to the train set for machine learning purposes. In both tracks, **no data from the previous editions should be used**. The evaluation metrics <http://multiword.sourceforge.net/PHITE.php?sitesig=CONF&page=CONF_04_LAW-MWE-CxG_2018___lb__COLING__rb__&subpage=CONF_50_Evaluation_metrics> will be the same as for edition 1.1. However, for edition 1.2, the published general ranking will emphasize 3 metrics: * global MWE-based, * global token-based, * unseen MWE-based. Do not forget to subscribe to the participants' mailing list <https://groups.google.com/forum/#!forum/verbalmwe>. We will also post the latest updates on the shared task 1.2 website <http://multiword.sourceforge.net/sharedtask2020/>. As seen from the previous PARSEME shared task editions, supervised VMWE identifiers are rather effective for seen VMWEs, but perform very poorly on unseen ones. We hope that this highly multilingual dataset will foster the development of systems with increased ability to identify VMWEs unseen at training time. This has been a tremendous collective effort, possible only with the strong commitment of many annotators, language leaders, organizers and technical support experts. We would like to thank all contributors for the time and enthusiasm they invested in the creation of this resource. In particular, the following people helped us by managing language-specific annotations and preparing the raw corpora: Abigail Walsh, Archna Bhatia, Chaya Liebeskind, Federico Sangati, Johanna Monti, Menghan Jiang, Hongzhi Xu, Rafael Ehren, Renata Ramisch, Sara Stymne, Timm Lichte, Tunga Güngör, Uxoa Iñurrieta, Verginica Barbu Mititelu, Voula Giouli, Zeynep Yirmibeşoğlu. All the best, Carlos Ramisch, Bruno Guillaume, Agata Savary, Jakub Waszczuk, Marie Candito and Ashwini Vaidya |
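The unseen-VMWE definition and the split selection described in the announcement above can be illustrated with a minimal Python sketch. This is not the official evaluation script or the organizers' splitting code. It assumes each sentence is already reduced to a list of annotated VMWEs, each given as a tuple of lemmas, and it simplifies the procedure to a two-way split (train+dev vs. test). The names `lemma_multiset`, `unseen_ratio` and `pick_split`, and the parameters `n_test` and `n_trials`, are hypothetical.

```python
import random
from collections import Counter
from typing import FrozenSet, List, Tuple

Vmwe = Tuple[str, ...]   # lemmas of one annotated VMWE (assumed input shape)
Sentence = List[Vmwe]    # all VMWEs annotated in one sentence


def lemma_multiset(vmwe: Vmwe) -> FrozenSet[Tuple[str, int]]:
    """Order-insensitive, count-sensitive key: the multi-set of lemmas."""
    return frozenset(Counter(vmwe).items())


def unseen_ratio(reference: List[Sentence], target: List[Sentence]) -> float:
    """Share of VMWEs in `target` whose lemma multi-set never occurs in `reference`."""
    seen = {lemma_multiset(v) for sent in reference for v in sent}
    vmwes = [v for sent in target for v in sent]
    if not vmwes:
        return 0.0
    unseen = sum(1 for v in vmwes if lemma_multiset(v) not in seen)
    return unseen / len(vmwes)


def pick_split(sentences: List[Sentence], n_test: int,
               n_trials: int = 20, seed: int = 0):
    """Run several random splits and keep the one whose unseen ratio
    is closest to the average over all trials (simplified, two-way)."""
    rng = random.Random(seed)
    candidates = []
    for _ in range(n_trials):
        shuffled = sentences[:]
        rng.shuffle(shuffled)
        test, rest = shuffled[:n_test], shuffled[n_test:]
        candidates.append((unseen_ratio(rest, test), rest, test))
    mean = sum(ratio for ratio, _, _ in candidates) / len(candidates)
    return min(candidates, key=lambda c: abs(c[0] - mean))


# Toy usage: lemma order does not matter, so ("ice", "break") counts as seen.
train_dev = [[("break", "ice")], [("give", "up")]]
test = [[("ice", "break")], [("take", "place")]]
ratio = unseen_ratio(train_dev, test)  # 0.5: only ("take", "place") is unseen
```

In a real setting the lemma tuples would have to be extracted from the .cupt files first, and the dev set would be split off from train in the same way. That extraction step is omitted here.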