{"id":29399,"date":"2024-05-29T14:30:29","date_gmt":"2024-05-29T12:30:29","guid":{"rendered":"https:\/\/opi-test.opi.org.pl\/?page_id=26117"},"modified":"2025-03-12T07:31:02","modified_gmt":"2025-03-12T06:31:02","slug":"open-access","status":"publish","type":"page","link":"https:\/\/opi.org.pl\/en\/what-we-do\/science-and-research\/science-for-everyone\/open-access\/","title":{"rendered":"Open access"},"content":{"rendered":"\n<h1 class=\"wp-block-heading\" id=\"Otwartydost\u0119p-Dzielimysi\u0119zTob\u0105naszymiosi\u0105gni\u0119ciaminaukowymi\">We share our scientific achievements<\/h1>\n\n\n\n<p><strong><strong><strong>To ensure easy access to OPI PIB&#8217;s scientific resources for the public, the institute has implemented an open access policy for publications and research data,&nbsp;which enables all users to explore OPI PIB&#8217;s discoveries.<\/strong><\/strong><\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" src=\"https:\/\/opi.org.pl\/wp-content\/uploads\/2024\/06\/otwarty-dostep-2x.jpg\" alt=\"\" class=\"wp-image-26791\" style=\"aspect-ratio:16\/9;object-fit:cover\"\/><\/figure>\n\n\n\n<p>OPI PIB supports the development of a society that benefits from scientific findings and the latest technological advancements. 
We share the results of our work for free.<\/p>\n\n\n\n<p>We believe that our approach contributes to the fostering of innovation in Poland.<br><\/p>\n\n\n\n<div class=\"wp-block-group opi-tlo4 opi-wide space-margin-top-1\"><div class=\"wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained\">\n<h2 class=\"wp-block-heading\" id=\"Otwartydost\u0119p-G\u0142\u00f3wnezasadypolitykiotwartegodost\u0119pu\"><strong><strong>Key principles of the open access&nbsp;policy<\/strong><\/strong><\/h2>\n\n\n\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:25%\">\n<h3 class=\"wp-block-heading\" id=\"Otwartydost\u0119p-Publikacje\" style=\"font-size:28px\">Publications<\/h3>\n\n\n\n<p>We want our publications to be accessible to the public. The majority of our publications are available free of charge in the \u2018<a href=\"https:\/\/opi.org.pl\/en\/what-we-do\/science-and-research\/science-for-everyone\/publications\/\" data-type=\"page\" data-id=\"26711\">Publications<\/a>\u2019 tab.<\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:25%\">\n<h3 class=\"wp-block-heading\" id=\"Otwartydost\u0119p-Wydawnictwo\" style=\"font-size:28px\">Publishing house<\/h3>\n\n\n\n<p>OPI PIB publishes monographs and other hard-copy and electronic publications, which are available on our \u2018<a href=\"https:\/\/opi.org.pl\/en\/what-we-do\/science-and-research\/science-for-everyone\/publishing-house\/\" data-type=\"link\" data-id=\"https:\/\/opi.org.pl\/en\/what-we-do\/science-and-research\/science-for-everyone\/publishing-house\/\">Publishing house<\/a>\u2019 page.<\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:25%\">\n<h3 class=\"wp-block-heading\" 
id=\"Otwartydost\u0119p-Danebadawcze\" style=\"font-size:28px\">Research data&nbsp;<\/h3>\n\n\n\n<p>We guarantee open access to research data and other related metadata by:&nbsp;<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>defining data usage rules<\/li>\n\n\n\n<li>storing data in an electronic research repository<\/li>\n\n\n\n<li>ensuring public access to data in line with the FAIR rules<\/li>\n\n\n\n<li>making agreements with research team members and other data creators.<\/li>\n<\/ul>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:25%\">\n<h3 class=\"wp-block-heading\" id=\"Otwartydost\u0119p-Zasobynaukowe\" style=\"font-size:28px\">Scientific resources&nbsp;<\/h3>\n\n\n\n<p>OPI PIB\u2019s scientific resources are also available in peer-reviewed journals, scholarly books and open repositories.<\/p>\n<\/div>\n<\/div>\n<\/div><\/div>\n\n\n\n<h2 class=\"wp-block-heading space-margin-top-2\" id=\"Otwartydost\u0119p-Gdzieznajdziesznaszedaneinarz\u0119dzia?\"><strong><strong>Where to find our tools and data<\/strong><\/strong><\/h2>\n\n\n\n<div class=\"wp-block-media-text has-media-on-the-right is-stacked-on-mobile opi-page-media-text\"><div class=\"wp-block-media-text__content\">\n<h2 class=\"wp-block-heading\">Machine learning models<\/h2>\n\n\n\n<p>Our experts have developed neural language models that are now available to all software developers. 
Neural language models enable internet users to access machine translation and chatbot services.<\/p>\n\n\n\n<p>Although most neural language models are designed specifically for the English language, experts at OPI PIB have also made Polish models available free of charge to the public.&nbsp;Models such as Qra are adapted to understand the Polish language and to generate text in Polish.<\/p>\n\n\n\n<p>Download them now and use them to your advantage.<\/p>\n\n\n\n<p>The models are available at&nbsp;<a href=\"https:\/\/github.com\/OPI-PIB\/ml-models\" class=\"external\" rel=\"nofollow\" target=\"blank\">GitHub OPI PIB<\/a>.<\/p>\n<\/div><figure class=\"wp-block-media-text__media\"><img decoding=\"async\" src=\"https:\/\/opi.org.pl\/wp-content\/uploads\/2024\/07\/Group-11124-2x.jpg\" alt=\"\" class=\"wp-image-29272 size-full\"\/><\/figure><\/div>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"Otwartydost\u0119p-Neuronowemodelej\u0119zyka\">Neural language models<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"Otwartydost\u0119p-Qra\">Qra<\/h3>\n\n\n\n<p>AI Lab and the Gda\u0144sk University of Technology have developed Polish generative neural language models that rely on the Llama2 model and have been trained on one terabyte of exclusively Polish text data. Qra is the first modern generative model to be pretrained on such a large Polish text corpus. There are three distinct Qra models, each of a different size: Qra 1B, Qra 7B and Qra 13B. 
Qra 7B and Qra 13B achieve significantly better perplexity results than the original Llama-2 models, demonstrating superior capabilities in modelling the comprehension, lexis and grammar of the Polish language.<\/p>\n\n\n\n<p>Download:&nbsp;<a href=\"https:\/\/huggingface.co\/OPI-PG\" class=\"external\" rel=\"nofollow\" target=\"blank\">Qra<\/a><\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"Otwartydost\u0119p-RoBERTa\">RoBERTa<\/h3>\n\n\n\n<p>A set of Polish neural language models that rely on the Transformer architecture and are trained using masked language modelling (MLM) and the techniques described in&nbsp;<a href=\"https:\/\/arxiv.org\/abs\/1907.11692\" class=\"external\" rel=\"nofollow\" target=\"blank\">RoBERTa: A Robustly Optimized BERT Pretraining Approach<\/a>. Two sizes of model are available: base and large. The base models are neural networks with approximately 100 million parameters; the large models contain 350 million. The large models offer higher prediction quality in practical use, but require more computational resources. Large Polish text corpora (20-200 GB) were used to train the models. 
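The perplexity metric used above to compare the Qra models with Llama-2 is straightforward to compute: it is the exponential of the average negative log-probability a model assigns to the observed tokens. A minimal sketch with invented toy probabilities (not actual model outputs):

```python
import math

def perplexity(token_probs):
    """Exponential of the mean negative log-probability that a model
    assigned to each actual next token; lower means a better fit."""
    nll = [-math.log(p) for p in token_probs]
    return math.exp(sum(nll) / len(nll))

# A model that spreads probability uniformly over a 4-word vocabulary
# assigns p = 0.25 to every token, giving perplexity 4 (maximal uncertainty).
print(perplexity([0.25, 0.25, 0.25, 0.25]))  # ≈ 4.0
```

A model that predicted every token with certainty (p = 1.0 throughout) would reach the lower bound of 1.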
Each model comes in two variants: one compatible with the&nbsp;<a href=\"https:\/\/github.com\/pytorch\/fairseq\" class=\"external\" rel=\"nofollow\" target=\"blank\">Fairseq<\/a>&nbsp;library and one with&nbsp;<a href=\"https:\/\/github.com\/huggingface\/transformers\" class=\"external\" rel=\"nofollow\" target=\"blank\">Huggingface Transformers<\/a>.<\/p>\n\n\n\n<p>Fairseq models:&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/YammFDDFyymxHjA\" class=\"external\" rel=\"nofollow\" target=\"blank\">base (version 1)<\/a>,&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/X78QyWBXmbTmWTr\" class=\"external\" rel=\"nofollow\" target=\"blank\">base (version 2)<\/a>,&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/TBM8q5Bzrqaa5XF\" class=\"external\" rel=\"nofollow\" target=\"blank\">large (version 1)<\/a>,&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/zwK4mofafDtgBx2\" class=\"external\" rel=\"nofollow\" target=\"blank\">large (version 2)<\/a><\/p>\n\n\n\n<p>Huggingface Transformers models:&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/j9A9Fmij6smDTe8\" class=\"external\" rel=\"nofollow\" target=\"blank\">base (version 1)<\/a>,&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/JonE4qDDjzsQAtT\" class=\"external\" rel=\"nofollow\" target=\"blank\">base (version 2)<\/a>,&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/RAmxCTKDNY4naWe\" class=\"external\" rel=\"nofollow\" target=\"blank\">large (version 1)<\/a>,&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/FTpq7ceAgdeyR5k\" class=\"external\" rel=\"nofollow\" target=\"blank\">large (version 2)<\/a><\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"Otwartydost\u0119p-BART\">BART<\/h3>\n\n\n\n<p>A Transformer neural language model that utilises an encoder-decoder architecture. 
BART was trained on a set of Polish documents totalling over 200 GB using the method described in&nbsp;<a href=\"https:\/\/arxiv.org\/abs\/1910.13461\" class=\"external\" rel=\"nofollow\" target=\"blank\">BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension<\/a>. The model can be adapted to solve predictive tasks, but is designed primarily for sequence-to-sequence tasks, such as machine translation and chatbots, in which documents serve as both the input and the output. The model comes in two variants: one compatible with the&nbsp;<a href=\"https:\/\/github.com\/pytorch\/fairseq\" class=\"external\" rel=\"nofollow\" target=\"blank\">Fairseq<\/a>&nbsp;library and one with&nbsp;<a href=\"https:\/\/github.com\/huggingface\/transformers\" class=\"external\" rel=\"nofollow\" target=\"blank\">Huggingface Transformers<\/a>.<\/p>\n\n\n\n<p>Download:&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/aw6o2g7joKS8m6D\" class=\"external\" rel=\"nofollow\" target=\"blank\">Fairseq model<\/a>,&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/nHPT3Ln7SBRyb5M\" class=\"external\" rel=\"nofollow\" target=\"blank\">Huggingface Transformers model<\/a><\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"Otwartydost\u0119p-GPT-2\">GPT-2<\/h3>\n\n\n\n<p>A neural language model that is based on the Transformer architecture and trained using the autoregressive language modelling method. The neural network architecture follows that of the English GPT-2 models described in&nbsp;<a href=\"https:\/\/d4mucfpksywv.cloudfront.net\/better-language-models\/language_models_are_unsupervised_multitask_learners.pdf\" class=\"external\" rel=\"nofollow\" target=\"blank\">Language Models are Unsupervised Multitask Learners<\/a>. OPI PIB offers the model in two sizes: medium, which contains approximately 350 million parameters, and large, which contains approximately 700 million parameters. 
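Autoregressive decoding, the generation scheme GPT-2 uses, can be illustrated without any real model: each new token is chosen from a distribution conditioned on what has been generated so far. The lookup table and Polish example words below are invented toy data, and conditioning on only the previous token is a simplification of the full-context attention a real Transformer uses:

```python
# Toy "language model": a table mapping the previous token to a
# probability distribution over possible next tokens.
NEXT_TOKEN = {
    "<s>": {"Ala": 0.9, "Kot": 0.1},
    "Ala": {"ma": 1.0},
    "ma": {"kota": 0.7, "psa": 0.3},
}

def generate_greedy(start="<s>", max_len=5):
    tokens = [start]
    # Keep appending the single most likely next token until the table
    # has no continuation or the length budget is exhausted.
    while tokens[-1] in NEXT_TOKEN and len(tokens) < max_len:
        dist = NEXT_TOKEN[tokens[-1]]
        tokens.append(max(dist, key=dist.get))  # greedy choice
    return tokens[1:]  # drop the start symbol

print(generate_greedy())  # → ['Ala', 'ma', 'kota']
```

Real decoders replace the greedy `max` with sampling or beam search, but the token-by-token loop is the same.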
The files are compatible with the&nbsp;<a href=\"https:\/\/github.com\/pytorch\/fairseq\" class=\"external\" rel=\"nofollow\" target=\"blank\">Fairseq<\/a>&nbsp;library.<\/p>\n\n\n\n<p>Download:&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/9p32SjLsASgepqz\" class=\"external\" rel=\"nofollow\" target=\"blank\">medium model<\/a>,&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/TGXs2CytKnTbjNx\" class=\"external\" rel=\"nofollow\" target=\"blank\">large model<\/a><\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"Otwartydost\u0119p-ELMo\">ELMo<\/h3>\n\n\n\n<p>A language model that is based on the long short-term memory (LSTM) recurrent neural networks presented in&nbsp;<a href=\"https:\/\/arxiv.org\/abs\/1802.05365\" class=\"external\" rel=\"nofollow\" target=\"blank\">Deep contextualized word representations<\/a>. The Polish language model is compatible with the&nbsp;<a href=\"https:\/\/github.com\/allenai\/allennlp\" class=\"external\" rel=\"nofollow\" target=\"blank\">AllenNLP<\/a>&nbsp;library.<\/p>\n\n\n\n<p>Download:&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/KrKRTytyQp7yka9\" class=\"external\" rel=\"nofollow\" target=\"blank\">model<\/a><\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"Otwartydost\u0119p-Statycznereprezentacjes\u0142\u00f3w\">Static representations of words<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"Otwartydost\u0119p-Word2Vec\">Word2Vec<\/h3>\n\n\n\n<p>Classic vector representations of words for the Polish language trained using the method described in&nbsp;<a href=\"https:\/\/arxiv.org\/abs\/1310.4546\" class=\"external\" rel=\"nofollow\" target=\"blank\">Distributed Representations of Words and Phrases and their Compositionality<\/a>. A large corpus of Polish-language documents was used to train the vectors. 
The set contains approximately 2 million words that appear at least three times in the corpus, as well as other defined symbol categories, such as punctuation marks, numbers from 0 to 10,000, and Polish forenames and surnames. The vectors are compatible with the&nbsp;<a href=\"https:\/\/radimrehurek.com\/gensim\/\" class=\"external\" rel=\"nofollow\" target=\"blank\">Gensim<\/a>&nbsp;library. The vectors offered by OPI PIB range from 100-dimensional to 800-dimensional.<\/p>\n\n\n\n<p>Download:&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/w7eTXQWeAJXX8tP\" class=\"external\" rel=\"nofollow\" target=\"blank\">100d<\/a>,&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/PnZD2Yck3jQT4ye\" class=\"external\" rel=\"nofollow\" target=\"blank\">300d<\/a>,&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/NMQXAjbi3yx7gZL\" class=\"external\" rel=\"nofollow\" target=\"blank\">500d<\/a>,&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/QTz8Jt2gbMmtnkx\" class=\"external\" rel=\"nofollow\" target=\"blank\">800d<\/a><\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"Otwartydost\u0119p-GloVe\">GloVe<\/h3>\n\n\n\n<p>Vector representations of words for the Polish language that have been trained using the&nbsp;<a href=\"https:\/\/aclanthology.org\/D14-1162\/\" class=\"external\" rel=\"nofollow\" target=\"blank\">GloVe<\/a>&nbsp;method developed at Stanford University. A large corpus of Polish-language documents was used to train the vectors. The set contains approximately 2 million words that appear at least three times in the corpus, as well as other defined symbol categories, such as punctuation marks, numbers from 0 to 10,000, and Polish forenames and surnames. The vectors are saved in a text format that is compatible with various libraries designed for this type of model. 
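Assuming the standard GloVe text layout (one word per line followed by its space-separated vector components), which is the format most tools expect, such a file can be read with a few lines of plain Python; the sample words and vectors below are invented:

```python
# Parse GloVe-style text: "word v1 v2 ... vd" on each line.
def parse_glove(text):
    vectors = {}
    for line in text.strip().splitlines():
        word, *values = line.split()
        vectors[word] = [float(v) for v in values]
    return vectors

# Two toy 3-dimensional entries standing in for a real vector file.
sample = "kot 0.1 -0.2 0.3\npies 0.4 0.0 -0.1"
vecs = parse_glove(sample)
print(len(vecs["kot"]))  # prints 3
```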
The vectors offered by OPI PIB range from 100-dimensional to 800-dimensional.<\/p>\n\n\n\n<p>Download:&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/qeWtsizPZxJZXCY\" class=\"external\" rel=\"nofollow\" target=\"blank\">100d<\/a>,&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/kzWtFTTWAnNnmS4\" class=\"external\" rel=\"nofollow\" target=\"blank\">300d<\/a>,&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/TEernXTfFco2EXt\" class=\"external\" rel=\"nofollow\" target=\"blank\">500d<\/a>,&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/MQ4LisDdagX5DWL\" class=\"external\" rel=\"nofollow\" target=\"blank\">800d<\/a><\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"Otwartydost\u0119p-FastText\">FastText<\/h3>\n\n\n\n<p>A model that contains vector representations of words and word parts in the Polish language. Unlike traditional static word representations, the model can generate vectors for words that are absent from its dictionary by summing the representations of parts of such words. The model was trained on a large corpus of Polish-language documents using the method described in&nbsp;<a href=\"https:\/\/arxiv.org\/abs\/1607.04606\" class=\"external\" rel=\"nofollow\" target=\"blank\">Enriching Word Vectors with Subword Information<\/a>. The set contains approximately 2 million words that appear at least three times in the corpus, as well as other defined symbol categories, such as punctuation marks, numbers from 0 to 10,000, and Polish forenames and surnames. The vectors are compatible with the&nbsp;<a href=\"https:\/\/radimrehurek.com\/gensim\/\" class=\"external\" rel=\"nofollow\" target=\"blank\">Gensim<\/a>&nbsp;library. 
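The subword mechanism described above can be sketched in miniature: a word is split into character n-grams with boundary markers, and an out-of-vocabulary word receives a vector built from the n-gram vectors that are known. The toy n-gram vectors here are invented, and real FastText hashes its n-grams rather than storing them in a plain dictionary:

```python
# Character n-grams with FastText-style boundary markers "<" and ">".
def char_ngrams(word, n=3):
    marked = f"<{word}>"
    return [marked[i:i + n] for i in range(len(marked) - n + 1)]

print(char_ngrams("kot"))  # → ['<ko', 'kot', 'ot>']

# Build a vector for an out-of-vocabulary word by averaging the
# vectors of whichever of its n-grams the model knows.
def oov_vector(word, ngram_vectors, dim):
    grams = [g for g in char_ngrams(word) if g in ngram_vectors]
    if not grams:
        return [0.0] * dim
    summed = [sum(vs) for vs in zip(*(ngram_vectors[g] for g in grams))]
    return [x / len(grams) for x in summed]
```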
The vectors offered by OPI PIB range from 100-dimensional to 800-dimensional.<\/p>\n\n\n\n<p>Download:&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/JGwNPApL4NH2Lza\" class=\"external\" rel=\"nofollow\" target=\"blank\">100d<\/a>,&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/5cGH7xMiJg3FzEW\" class=\"external\" rel=\"nofollow\" target=\"blank\">300d<\/a>,&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/kgMqjCL7WM3zQ62\" class=\"external\" rel=\"nofollow\" target=\"blank\">500d<\/a>,&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/o2e37A6KsZ4odtd\" class=\"external\" rel=\"nofollow\" target=\"blank\">800d (part 1)<\/a>,&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/a6926zpKPLy9Bq7\" class=\"external\" rel=\"nofollow\" target=\"blank\">800d (part 2)<\/a><\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"Otwartydost\u0119p-Modelet\u0142umaczeniamaszynowego\">Machine translation models<\/h3>\n\n\n\n<p>Polish-English and English-Polish machine translation models based on convolutional neural networks. The models translate documents using the&nbsp;<a href=\"https:\/\/github.com\/pytorch\/fairseq\" class=\"external\" rel=\"nofollow\" target=\"blank\">Fairseq<\/a>&nbsp;library. 
They were trained&nbsp;on data from the&nbsp;<a href=\"http:\/\/opus.nlpl.eu\/\" class=\"external\" rel=\"nofollow\" target=\"blank\">OPUS<\/a>&nbsp;website: a set of 40 million pairs of source- and target-language sentences.<\/p>\n\n\n\n<p>Download:&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/ztGPz7q7aHk4CfH\" class=\"external\" rel=\"nofollow\" target=\"blank\">Polish-English model<\/a>,&nbsp;<a href=\"https:\/\/share.opi.org.pl\/s\/GTW5n4KdiyFcaAq\" class=\"external\" rel=\"nofollow\" target=\"blank\">English-Polish model<\/a><\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"Otwartydost\u0119p-Modeledowykrywaniasymptom\u00f3wdepresji\">Models for detecting signs of depression<\/h3>\n\n\n\n<p>The models are part of the winning solution in&nbsp;<a href=\"https:\/\/competitions.codalab.org\/competitions\/36410\" class=\"external\" rel=\"nofollow\" target=\"blank\">the Shared Task on Detecting Signs of Depression from Social Media Text<\/a>&nbsp;competition, which was organised at the&nbsp;<a href=\"https:\/\/sites.google.com\/view\/lt-edi-2022\/home\" class=\"external\" rel=\"nofollow\" target=\"blank\">LT-EDI-ACL2022<\/a>&nbsp;conference. Competitors were tasked with creating a system capable of determining three levels of user depression (no depression, moderate depression and severe depression), based on their social media posts in English. OPI PIB&#8217;s&nbsp;solution consisted of three models: two classification models and the DepRoBERTa (RoBERTa for Depression Detection) language model. DepRoBERTa was trained on a corpus of approximately 400,000 Reddit posts, mainly concerning depression, anxiety and suicidal thoughts. The models are compatible with the popular&nbsp;<a href=\"https:\/\/github.com\/huggingface\/transformers\" class=\"external\" rel=\"nofollow\" target=\"blank\">Huggingface Transformers<\/a>&nbsp;machine learning library. 
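The final step of a three-level classifier like the one described above maps raw model scores (logits) to labels. This is not the actual DepRoBERTa code, only a self-contained sketch of that mapping with invented scores; the label strings follow the competition's three levels:

```python
import math

LABELS = ["no depression", "moderate depression", "severe depression"]

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # shift by max for stability
    total = sum(exps)
    return [e / total for e in exps]

def predict(logits):
    probs = softmax(logits)
    return LABELS[probs.index(max(probs))]

print(predict([0.2, 2.1, -1.0]))  # prints moderate depression
```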
For more information on the competition and on OPI PIB&#8217;s solution, see&nbsp;<a href=\"https:\/\/aclanthology.org\/2022.ltedi-1.40\/\" class=\"external\" rel=\"nofollow\" target=\"blank\">OPI@LT-EDI-ACL2022: Detecting Signs of Depression from Social Media Text using RoBERTa Pre-trained Language Models<\/a>.<\/p>\n\n\n\n<p>Models:&nbsp;<a href=\"https:\/\/huggingface.co\/rafalposwiata\/deproberta-large-v1\" class=\"external\" rel=\"nofollow\" target=\"blank\">DepRoBERTa<\/a>,&nbsp;<a href=\"https:\/\/huggingface.co\/rafalposwiata\/roberta-large-depression\" class=\"external\" rel=\"nofollow\" target=\"blank\">roberta-depression-detection<\/a>,&nbsp;<a href=\"https:\/\/huggingface.co\/rafalposwiata\/deproberta-large-depression\" class=\"external\" rel=\"nofollow\" target=\"blank\">deproberta-depression-detection<\/a><\/p>\n\n\n\n<h2 class=\"wp-block-heading space-margin-top-2\" id=\"Otwartydost\u0119p-Narz\u0119dziedoprzetwarzaniaj\u0119zykanaturalnego\">Natural language processing toolkit&nbsp;<\/h2>\n\n\n\n<p>We invite programmers to use our natural language processing toolkit. Discover the&nbsp;<strong>OPI PIB Toolkit for NLP<\/strong>.<\/p>\n\n\n\n<p>The toolkit provides a REST API and integrates four language models. OPI PIB&#8217;s API enables its users to train and test their own programmes based on natural language processing (NLP) solutions.<\/p>\n\n\n\n<p>The toolkit is simple, compact and ready to use. It saves users the time that would otherwise be required to configure multiple language models. 
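A REST-based NLP service of this kind is typically called with a JSON POST request. The endpoint URL and request fields below are purely illustrative assumptions, not the documented OPI PIB API (the real specification is on the Inventorum site linked below):

```python
import json
import urllib.request

# Hypothetical sketch: build (but do not send) a JSON POST request to
# an NLP analysis endpoint. "example.org" and the payload fields are
# placeholders, not the actual OPI PIB API.
def build_request(text, lang="pl", url="https://example.org/nlp/analyze"):
    payload = json.dumps({"text": text, "lang": lang}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request("Ala ma kota")
print(req.get_method())  # prints POST
```

Sending the request (`urllib.request.urlopen(req)`) and decoding the JSON response would complete the round trip against a real endpoint.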
Users can use preset components to create their own, more advanced solutions and applications quickly and seamlessly.<\/p>\n\n\n\n<div class=\"wp-block-group opi-page-box-highlight\"><div class=\"wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained\">\n<h3 class=\"wp-block-heading\">The OPI PIB Toolkit for NLP is:<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>multilingual\u2014users can analyse documents written in Polish, English, German or French<\/li>\n\n\n\n<li>ready to use\u2014users can prototype and develop their own solutions<\/li>\n\n\n\n<li>compact\u2014users can spend more time solving real problems instead of configuring and implementing basic NLP functionalities.<\/li>\n<\/ul>\n\n\n\n<p>The OPI PIB Toolkit for NLP is available at the&nbsp;<a href=\"https:\/\/inventorum.opi.org.pl\/inventorum-process-nlp-web\/doc\/\" class=\"external\" rel=\"nofollow\" target=\"blank\">Inventorum website<\/a>.<\/p>\n<\/div><\/div>\n\n\n\n<h2 class=\"wp-block-heading space-margin-top-2\" id=\"Otwartydost\u0119p-Zbiorydanychnaukowych\">Scientific datasets<\/h2>\n\n\n\n<p>OPI PIB believes that the development of Polish science is paramount. That is why the institute has made its scientific datasets available to all researchers. Scholars publish and give access to raw source and partially processed data that lays the groundwork for future research work. 
The data pertains to various OPI PIB research projects.<\/p>\n\n\n\n<div class=\"wp-block-media-text has-media-on-the-right is-stacked-on-mobile opi-page-media-text\"><div class=\"wp-block-media-text__content\">\n<h2 class=\"wp-block-heading\">Downloadable data<\/h2>\n\n\n\n<p style=\"font-size:20px\">We have made the following data available:&nbsp;<\/p>\n\n\n\n<ul class=\"wp-block-list opi-ul-downloads\">\n<li><span>data on information extraction from HTML documents <a href=\"https:\/\/zenodo.org\/record\/1212605#.YoUoX-hBxD8\" data-type=\"link\" data-id=\"https:\/\/zenodo.org\/record\/1212605#.YoUoX-hBxD8\" class=\"external\" rel=\"nofollow\" target=\"blank\">Download [6.9 MB]<\/a><\/span><\/li>\n\n\n\n<li><span>data on information extraction from emergency and firefighting reports <a href=\"https:\/\/zenodo.org\/record\/885436#.YoUoYuhBxD8\" data-type=\"link\" data-id=\"https:\/\/zenodo.org\/record\/885436#.YoUoYuhBxD8\" class=\"external\" rel=\"nofollow\" target=\"blank\">Download [298.7 kB]<\/a><\/span><\/li>\n\n\n\n<li><span>data on the results of the classification of commercial websites via various machine learning methods to identify innovative firms <a href=\"https:\/\/zenodo.org\/record\/2537998#.YoUoa-hBxD8\" data-type=\"link\" data-id=\"https:\/\/zenodo.org\/record\/2537998#.YoUoa-hBxD8\" class=\"external\" rel=\"nofollow\" target=\"blank\">Download [983.1 kB]<\/a><\/span><\/li>\n\n\n\n<li><span>data on publications on classification of text documents <a href=\"https:\/\/zenodo.org\/record\/1207374#.YoUoZ-hBxD8\" data-type=\"link\" data-id=\"https:\/\/zenodo.org\/record\/1207374#.YoUoZ-hBxD8\" class=\"external\" rel=\"nofollow\" target=\"blank\">Download [187.7 kB]<\/a><\/span><\/li>\n\n\n\n<li><span>database of mpMRI scans for prostate cancer diagnosis. 
<a href=\"https:\/\/ai4ar.opi.org.pl\/baza-obrazow-mpmri\" data-type=\"link\" data-id=\"https:\/\/ai4ar.opi.org.pl\/baza-obrazow-mpmri\" class=\"external\" rel=\"nofollow\" target=\"blank\">Download [74,1 GB]<\/a><\/span><\/li>\n<\/ul>\n<\/div><figure class=\"wp-block-media-text__media\"><img decoding=\"async\" src=\"https:\/\/opi.org.pl\/wp-content\/uploads\/2024\/07\/Group-11160-2x-1200x887.jpg\" alt=\"\" class=\"wp-image-29273 size-full\"\/><\/figure><\/div>\n\n\n\n<div class=\"wp-block-media-text is-stacked-on-mobile opi-page-media-text\"><figure class=\"wp-block-media-text__media\"><img decoding=\"async\" src=\"https:\/\/opi.org.pl\/wp-content\/uploads\/2024\/07\/Mask-Group-11131-2x-1200x818.jpg\" alt=\"\" class=\"wp-image-29278 size-full\"\/><\/figure><div class=\"wp-block-media-text__content\">\n<h2 class=\"wp-block-heading\" id=\"Otwartydost\u0119p-AplikacjaVRHomeafterWar\"><strong><em>Home after War<\/em>&nbsp;in VR<\/strong><\/h2>\n\n\n\n<p><em>Home after War<\/em>&nbsp;is a free VR application that is available on the Oculus Store. In it, you are introduced to Ahmaid, who has experienced violence in the Middle East at the hands of ISIS.<\/p>\n\n\n\n<p>Ahmaid shows you around his home, whose details have been meticulously recreated on the basis of scans of his real-life home. The owner tells his story and explains the consequences of his return after the defeat of ISIS.<\/p>\n\n\n\n<p>The Polish language version of&nbsp;<em>Home after War<\/em>&nbsp;was prepared by OPI PIB with the intention of immersing Polish-speaking users in this unique experience. 
The release also helps experts at OPI PIB to conduct research on the impact of VR on empathy.<\/p>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-16018d1d wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a class=\"wp-block-button__link wp-element-button external\" href=\"https:\/\/www.meta.com\/pl-pl\/experiences\/2900834523285203\/\" rel=\"nofollow\" target=\"blank\">Read more<\/a><\/div>\n<\/div>\n\n\n\n<p><\/p>\n<\/div><\/div>\n\n\n\n<h2 class=\"wp-block-heading space-margin-top-2\" id=\"Otwartydost\u0119p-Kod\u017ar\u00f3d\u0142owyNavoica\">The NAVOICA education platform source code<\/h2>\n\n\n\n<p>OPI PIB supports lifelong learning. The institute has made the NAVOICA platform source code available to the public. NAVOICA is a learning management system platform that is used to develop scalable MOOC e-learning websites that enable the creation and implementation of courses for any number of participants in an asynchronous model.<\/p>\n\n\n\n<div class=\"wp-block-group opi-page-box-highlight opi-page-box-highlight-medium\"><div class=\"wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained\">\n<p>NAVOICA is a modified version of the Open edX platform. The tool is popular both in Poland and abroad. 
We hope that making the source code available to the public will help in the creation of new professional educational platforms.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<p>Download:&nbsp;<a href=\"https:\/\/github.com\/OPI-PIB\/navoica-platform\" class=\"external\" rel=\"nofollow\" target=\"blank\">GitHub \u2013 OPI-PIB\/navoica-platform<\/a><\/p>\n<\/div><\/div>\n\n\n\n<p><\/p>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-left is-layout-flex wp-container-core-buttons-is-layout-fdcfc74e wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button button-reverse2\"><a class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/opi.org.pl\/wp-content\/uploads\/2023\/03\/Zalacznik-nr-1-do-Zarzadzenia-nr-17-2022-sig.pdf\">Read more about our&nbsp;Open Access Policy<\/a><\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>We share our scientific achievements To ensure easy access to OPI PIB&#8217;s scientific resources for the public, the institute has implemented an open access policy for publications and research data,&nbsp;which enables all users to explore OPI PIB&#8217;s discoveries. 
OPI PIB supports the development of a society that benefits from scientific findings and the latest technological&hellip;<\/p>\n","protected":false},"author":34,"featured_media":0,"parent":29397,"menu_order":100,"comment_status":"closed","ping_status":"closed","template":"templates\/page-without-title.php","meta":{"footnotes":"","_links_to":"","_links_to_target":""},"class_list":["post-29399","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/opi.org.pl\/en\/wp-json\/wp\/v2\/pages\/29399","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/opi.org.pl\/en\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/opi.org.pl\/en\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/opi.org.pl\/en\/wp-json\/wp\/v2\/users\/34"}],"replies":[{"embeddable":true,"href":"https:\/\/opi.org.pl\/en\/wp-json\/wp\/v2\/comments?post=29399"}],"version-history":[{"count":3,"href":"https:\/\/opi.org.pl\/en\/wp-json\/wp\/v2\/pages\/29399\/revisions"}],"predecessor-version":[{"id":32520,"href":"https:\/\/opi.org.pl\/en\/wp-json\/wp\/v2\/pages\/29399\/revisions\/32520"}],"up":[{"embeddable":true,"href":"https:\/\/opi.org.pl\/en\/wp-json\/wp\/v2\/pages\/29397"}],"wp:attachment":[{"href":"https:\/\/opi.org.pl\/en\/wp-json\/wp\/v2\/media?parent=29399"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}