Ilościowe charakterystyki złożoności języka naturalnego (Quantitative characteristics of natural language complexity)

dc.contributor.advisor: Kwapień, Jarosław
dc.contributor.advisor: Oświęcimka, Paweł
dc.contributor.author: Kulig, Andrzej
dc.contributor.reviewer: Burda, Zdzisław
dc.contributor.reviewer: Kutner, Ryszard
dc.date.accessioned: 2017-12-20T08:26:39Z
dc.date.available: 2017-12-20T08:26:39Z
dc.date.issued: 2014
dc.description.abstract: This doctoral dissertation advances the following main theses:
- As samples of natural language, literary texts show several properties of complex systems: they have internal organization, including a hierarchical structure, and the interactions between their components, such as words, are complicated in nature, which can result, among other things, from the imposed rules of grammar and from an author’s style of writing. One also observes the formation of large-scale effects that cannot be explained from knowledge of the individual words alone. Such effects can include the content, emotional charge, and artistic value of the text.
- Interactions between words defined by their mutual adjacency, when expressed in a network representation, show certain features of networks with accelerated growth and an approximately scale-free node degree distribution. Such networks are also characterized by a distinctive tendency toward condensation, which shortens the path lengths between nodes as the number of nodes increases.
- Despite strong differences in grammar, different European languages do not show comparable differences in network topology. Substantially larger differences can be seen within one language when one compares texts that represent different literary genres.
- Modelling of the empirical word adjacency networks is possible either directly, via appropriate network models (e.g., various kinds of networks with accelerated growth), or indirectly, via a network representation of the relevant stochastic processes. Comparing the topology of the model networks with the empirical ones shows, however, that language has some subtleties that cannot be fully expressed by relatively simple, generic models.
- Literary texts, when parameterized by sentence lengths and expressed as time series, show a clear fractal structure and, in some cases, even a multifractal structure. From the standpoint of literary studies, the latter group of texts can be linked with a narrative technique called the stream of consciousness.
This dissertation is divided into 5 chapters. Chapter 1 contains a short introduction listing the main objectives and theses of the work. Chapter 2 describes the phenomenon of natural language: its origins, evolution, and morphology. The main theories of the origin of language and the formal classification of languages are also discussed in this part of the work. Chapter 3 contains an introduction to complex systems science. It begins by explaining why physics is the branch of science best equipped to examine such systems, and natural language in particular. The notion of complexity is then introduced, and the most important properties of complex systems are discussed together with the methodology for studying them. Chapter 4 describes all the analyses and discusses the results obtained. It is composed of several sections devoted to specific issues. Section 4.1 presents a statistical analysis of empirical data consisting of the vocabulary of six European languages, with particular emphasis on the Zipf approach. In Section 4.2, literary texts expressed by word adjacencies are subjected to network analysis. Of interest are the topological properties of these networks, especially the node connectivity distributions and the average shortest path lengths. The empirical results are compared with the results of simulations based on different network models. Finally, Section 4.3 presents the results of the fractal analysis applied to time series of sentence lengths, with the main emphasis on the identification of multifractal properties. Chapter 5 contains a summary with a critical discussion of the results presented throughout this work, as well as an indication of possible directions for future research.
dc.description.physical: 138
dc.identifier.uri: http://rifj.ifj.edu.pl/handle/item/79
dc.language.iso: pol
dc.publisher: Institute of Nuclear Physics Polish Academy of Sciences
dc.title: Ilościowe charakterystyki złożoności języka naturalnego
dc.type: doctoralThesis

Files

Original bundle
Name: rozpr_Kulig.pdf
Size: 22.49 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 846 B
Format: Item-specific license agreed upon to submission