IFJ PAN Repository

Quantitative characteristics of natural language complexity (Ilościowe charakterystyki złożoności języka naturalnego)

dc.contributor.advisor Kwapień, Jarosław
dc.contributor.advisor Oświęcimka, Paweł
dc.contributor.author Kulig, Andrzej
dc.date.accessioned 2017-12-20T08:26:39Z
dc.date.available 2017-12-20T08:26:39Z
dc.date.issued 2014
dc.identifier.uri http://rifj.ifj.edu.pl/handle/item/79
dc.description.abstract This doctoral dissertation advances the following main theses:
- As samples of natural language, literary texts show several properties of complex systems: they have internal organization, including a hierarchical structure, and the interactions between their components, such as words, are complicated in nature, which may be a consequence of, among other things, the imposed rules of grammar and the author’s style of writing. One also observes the formation of large-scale effects that cannot be explained from knowledge of the individual words alone. Such effects can include the content, emotional charge, and artistic value of the text.
- Interactions between words defined by their mutual adjacency, when expressed in a network representation, show certain features of networks with accelerated growth and an approximately scale-free node degree distribution. Such networks are also characterized by a unique tendency towards condensation, which shortens the path lengths between nodes as the number of nodes increases.
- Despite strong differences in grammar, different European languages do not show comparable differences in network topology. Substantially larger differences can be seen within one language when one compares texts that represent different literary genres.
- Modelling of the empirical word adjacency networks is possible either directly, via appropriate network models (e.g., various kinds of networks with accelerated growth), or indirectly, via a network representation of the relevant stochastic processes. Comparing the topology of the model networks with the empirical ones shows, however, that language has subtleties that cannot be fully expressed by relatively simple, generic models.
- Literary texts, if parameterized by sentence lengths and expressed in the form of time series, show a clear fractal structure, and in some cases even a multifractal structure. From the standpoint of literary studies, the latter group of texts can be linked with a narrative technique called the stream of consciousness.
This dissertation is divided into 5 chapters. Chapter 1 contains a short introduction listing the main objectives and theses of the work. Chapter 2 is devoted to a description of the phenomenon of natural language: its origins, evolution, and morphology. The main theories of the origin of language and the formal classification of languages are also discussed in this part of the work. Chapter 3 contains an introduction to complex systems science. It begins by explaining why physics is the branch of science best equipped to examine such systems, and natural language in particular. The notion of complexity is then introduced, and the most important properties of complex systems are discussed together with the methodology used to study them. Chapter 4 describes all the analyses and discusses the results obtained. It is composed of several sections devoted to specific issues. Section 4.1 presents a statistical analysis of empirical data consisting of the vocabulary of six European languages, with particular emphasis on the Zipf approach. In Section 4.2, literary texts expressed by word adjacencies are subjected to network analysis. Of interest are the topological properties of these networks, especially the node connectivity distributions and the average shortest path lengths. Empirical results are compared with the results of simulations based on different network models. Section 4.3 presents the results of the fractal analysis applied to time series of sentence lengths, with the main emphasis on identifying multifractal properties. Finally, Chapter 5 contains a summary with a critical discussion of the results presented throughout this work, as well as an indication of possible directions for future research. pl_PL.UTF-8
dc.language.iso pol pl_PL.UTF-8
dc.publisher Institute of Nuclear Physics Polish Academy of Sciences pl_PL.UTF-8
dc.title Ilościowe charakterystyki złożoności języka naturalnego pl_PL.UTF-8
dc.type doctoralThesis pl_PL.UTF-8
dc.contributor.reviewer Burda, Zdzisław
dc.contributor.reviewer Kutner, Ryszard
dc.description.physical 138 pl_PL.UTF-8

