Advanced Technologies, Systems, and Applications III - Samir Avdaković Editor - Proceedings of the International Symposium on Innovative and ...

Lecture Notes in Networks and Systems 59

Samir Avdaković Editor

Advanced
Technologies,
Systems, and
Applications III
Proceedings of the International
Symposium on Innovative and
Interdisciplinary Applications of
Advanced Technologies (IAT), Volume 1
Lecture Notes in Networks and Systems

Volume 59

Series editor
Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland
e-mail: kacprzyk@ibspan.waw.pl
The series “Lecture Notes in Networks and Systems” publishes the latest
developments in Networks and Systems—quickly, informally and with high quality.
Original research reported in proceedings and post-proceedings represents the core
of LNNS.
   Volumes published in LNNS embrace all aspects and subfields of, as well as
new challenges in, Networks and Systems.
   The series contains proceedings and edited volumes in systems and networks,
spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor
Networks, Control Systems, Energy Systems, Automotive Systems, Biological
Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems,
Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems,
Robotics, Social Systems, Economic Systems and other. Of particular value to both
the contributors and the readership are the short publication timeframe and the
world-wide distribution and exposure which enable both a wide and rapid
dissemination of research output.
   The series covers the theory, applications, and perspectives on the state of the art
and future developments relevant to systems and networks, decision making, control,
complex processes and related areas, as embedded in the fields of interdisciplinary
and applied sciences, engineering, computer science, physics, economics, social, and
life sciences, as well as the paradigms and methodologies behind them.

Advisory Board
Fernando Gomide, Department of Computer Engineering and Automation—DCA, School of
Electrical and Computer Engineering—FEEC, University of Campinas—UNICAMP,
São Paulo, Brazil
e-mail: gomide@dca.fee.unicamp.br
Okyay Kaynak, Department of Electrical and Electronic Engineering, Bogazici University,
Istanbul, Turkey
e-mail: okyay.kaynak@boun.edu.tr
Derong Liu, Department of Electrical and Computer Engineering, University of Illinois
at Chicago, Chicago, USA and Institute of Automation, Chinese Academy of Sciences,
Beijing, China
e-mail: derong@uic.edu
Witold Pedrycz, Department of Electrical and Computer Engineering, University of Alberta,
Alberta, Canada and Systems Research Institute, Polish Academy of Sciences, Warsaw,
Poland
e-mail: wpedrycz@ualberta.ca
Marios M. Polycarpou, KIOS Research Center for Intelligent Systems and Networks,
Department of Electrical and Computer Engineering, University of Cyprus, Nicosia, Cyprus
e-mail: mpolycar@ucy.ac.cy
Imre J. Rudas, Óbuda University, Budapest Hungary
e-mail: rudas@uni-obuda.hu
Jun Wang, Department of Computer Science, City University of Hong Kong
Kowloon, Hong Kong
e-mail: jwang.cs@cityu.edu.hk

More information about this series at http://www.springer.com/series/15179
Samir Avdaković
Editor

Advanced Technologies,
Systems, and Applications III
Proceedings of the International Symposium
on Innovative and Interdisciplinary
Applications of Advanced Technologies
(IAT), Volume 1

Editor
Samir Avdaković
Faculty of Electrical Engineering
University of Sarajevo
Sarajevo, Bosnia and Herzegovina

ISSN 2367-3370                      ISSN 2367-3389 (electronic)
Lecture Notes in Networks and Systems
ISBN 978-3-030-02573-1              ISBN 978-3-030-02574-8 (eBook)
https://doi.org/10.1007/978-3-030-02574-8

Library of Congress Control Number: 2016954521

© Springer Nature Switzerland AG 2019
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made. The publisher remains neutral with regard to
jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Contents

Applied Mathematics
Detecting Functional States of the Rat Brain with Topological
Data Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .    3
Nianqiao Ju, Ismar Volić, and Michael Wiest
Benford’s Law and Sum Invariance Testing . . . . . . . . . . . . . . . . . . . . .                        13
Zoran Jasak
Using Partial Least Squares Structural Equation Modeling
to Predict Entrepreneurial Capacity in Transition Economies . . . . . . . .                               22
Matea Zlatković
Mathematical Modeling and Statistical Representation
of Experimental Access . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .          36
Amina Delić-Zimić and Fatih Destović

Advanced Electrical Power Systems (Planning, Operation
and Control)
Comparison of Different Techniques for Power System
State Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .    51
Dženana Tomašević, Samir Avdaković, Zijad Bajramović,
and Izet Džananović
Fuzzy Multicriteria Decision Making Model for HPP
Alternative Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .     62
Zedina Lavić and Sabina Dacić-Lepara
The Valuation of Kron Reduction Application in Load
Flow Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .    70
Tarik Hubana, Sidik Hodzic, Emir Alihodzic, and Ajdin Mulaosmanovic


Application of Artificial Neural Network and Empirical Mode
Decomposition for Predications of Hourly Values of Active
Power Consumption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .    86
Maja Muftić Dedović, Nedis Dautbašić, and Adnan Mujezinović
The Small Signal Stability Analysis of a Power System
with Wind Farms - Bosnia and Herzegovina Case Study . . . . . . . . . . . .                          98
Semir Nurković and Samir Avdaković
Classification of Distribution Network Faults Using Hilbert-Huang
Transform and Artificial Neural Network . . . . . . . . . . . . . . . . . . . . . . . 114
Tarik Hubana, Mirza Šarić, and Samir Avdaković
Distributed Generation Allocation: Objectives, Constraints
and Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
Mirza Šarić, Jasna Hivziefendić, and Nejdet Dogru
The Effect of Summer Months and the Profitability Assessment
of the PV Systems in Bosnia and Herzegovina . . . . . . . . . . . . . . . . . . . . 150
Faruk Bešlija and Ajla Merzić
Near Zero-Energy Home Prediction of Appliances Energy
Consumption Using the Reduced Set of Features and Random
Decision Tree Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
Lejla Bandić and Jasmin Kevrić
Experience in Work of Automatic Meter Management System
in JP Elektroprivreda B&H d.d. Sarajevo, Subsidiary
“Elektrodistribucija”, Zenica . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
Ahmed Mutapcic and Adnan Memic
Financial Impacts of Replacing Old Transmission Lines with
Aluminum Composite Core Conductors . . . . . . . . . . . . . . . . . . . . . . . . . 187
Semir Hadžimuratović
Energy Efficiency Evaluation of an Academic Building – Case Study:
Faculty of Electrical Engineering, University of Sarajevo . . . . . . . . . . . 198
Amna Šoše, Tatjana Konjić, and Nedis Dautbašić
Fault Identification in Electrical Power Distribution System – Case
Study of the Middle Bosnia Medium Voltage Grid . . . . . . . . . . . . . . . . 211
Jasmina Čučuković and Faruk Hidić
Implementation of Microgrid on Location Rostovo with Installation
of Sustainable Hybrid Power System (Case Study of a Real
Medium-Voltage Network) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
Fatima Mašić, Belmin Memišević, Adnan Bosović, Ajla Merzić,
and Mustafa Musić

Implementation of Protection and Control Systems
in the Transmission SS 110/10(20)/10 kV Using IEC
61850 GOOSE Messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
Adnan Cokić and Admir Čeljo
Design, Optimization and Feasibility Assessment of Hybrid Power
Systems Based on Renewable Energy Resources: A Future Concept
Case Study of Remote Ski Centers in Herzegovina Region . . . . . . . . . . 255
Said Ćosić and Ajla Merzić

Power Quality
PV Plant Connection in Urban and Rural LV Grid: Comparison
of Voltage Quality Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
Ivan Ramljak and Ivana Ramljak
Monitoring of Non-ionizing Electromagnetic Fields in the Urban
Zone of Tuzla City . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
Vlado Madžarević, Majda Tešanović, and Mevlida Hrustanović-Bajrić
Improving the Krnovo Wind Power Plant Efficiency by Means
of the Lithium-Ion Battery Storage System . . . . . . . . . . . . . . . . . . . . . . 289
Filip Drinčić, Saša Mujović, Martin Ćalasan, and Lazar Nikitović

Computer Modelling and Simulations for Engineering Applications
Modelling the Dephosphorization Process in a Swaying
Oxygen Converter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
Damir Kahrimanovic, Erich Wimmer, Stefan Pirker, and Bernhard König
Bare Conductor Temperature Coefficient Identification by Means
of Differential Evolution Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
Mirza Sarajlić, Marko Pocajt, Peter Kitak, Nermin Sarajlić, and Jože Pihler
Preliminary Considerations on Double Diffusion Instabilities
in Two Quaternary Isothermal Systems of Biological Relevance . . . . . . 326
Berin Šeta, Josefina Gavaldà, Muris Torlak, and Xavier Ruiz
Stress Analysis of the Support for Double Motion Mechanism
Inside 420 kV 63 kA SF6 Interrupter . . . . . . . . . . . . . . . . . . . . . . . . . . 336
Džanko Hajradinović, Mahir Muratović, and Amer Smajkić
Solving Linear Wave Equation Using a Finite-Volume Method
in Time Domain on Unstructured Computational Grids . . . . . . . . . . . . 347
Muris Torlak and Vahidin Hadžiabdić

Mechatronics, Robotics and Embedded Systems
HaBEEtat: Integrated Cloud-Based Solution for More Efficient Honey
Production and Improve Well-Being of Bee’s Population . . . . . . . . . . . . 359
Semir Šakanović and Jasmin Kevrić
PID-Controlled Laparoscopic Appendectomy Device . . . . . . . . . . . . . . . 375
Abdul Rahman Dabbour, Asif Sabanovic, and Meltem Elitaş
Radial Basis Gaussian Functions for Modelling Motor Learning
Process of Human Arm Movement in the Ballistic Task – Hit a Target                                        383
Slobodan Lubura, Dejan Ž. Jokić, and Goran S. Đorđević
An Open and Extensible Data Acquisition and Processing Platform
for Rehabilitation Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 394
Sehrizada Sahinovic, Amina Dzebo, Baris Can Ustundag, Edin Golubovic,
and Tarik Uzunovic

Information and Communication Technologies
Smart Home System - Remote Monitoring and Control
Using Mobile Phone . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 409
Merisa Škrgić, Una Drakulić, and Edin Mujčić
Development of Educational Karate Games with the Help
of Scenes and Characters from the Popular Cartoon Series . . . . . . . . . . 420
Jasna Hamzabegović and Mirza Koljić
A Platform for Human-Machine Information Data Fusion . . . . . . . . . . 430
Migdat Hodžić
Soft Data Modeling via Type 2 Fuzzy Distributions for Corporate
Credit Risk Assessment in Commercial Banking . . . . . . . . . . . . . . . . . . 457
Sabina Brkić, Migdat Hodžić, and Enis Džanić
Design and Experimental Analysis of the Mobile System
Based on the Android Platform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 470
Anida Đuzelić
Last Mile at FTTH Networks: Challenges in Building Part
of the Optical Network from the Distribution Point to the Users
in Bosnia and Herzegovina . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 480
Anis Maslo, Mujo Hohzic, Aljo Mujcic, and Edvin Skaljo
Which Container Should I Use? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 487
Esmira Muslija and Edin Pjanić
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 505
Applied Mathematics
Detecting Functional States of the Rat Brain
                with Topological Data Analysis

                  Nianqiao Ju^1 (✉), Ismar Volić^2, and Michael Wiest^3

        1 Department of Statistics, Harvard University, Cambridge, MA 02138, USA
          nju@g.harvard.edu
        2 Mathematics Department, Wellesley College, Wellesley, MA 02481, USA
          ivolic@wellesley.edu
        3 Neuroscience Program, Wellesley College, Wellesley, MA 02481, USA
          mwiest@wellesley.edu

       Abstract. One of the cutting-edge methods for analyzing large sets of data
       involves looking at their “shape”, namely their geometry and topology. In this
       paper, we apply topological analysis to data arising from a neuroscience
       experiment involving multichannel voltage measurements of brain activity in
       awake rats. Data points are viewed as a point cloud, with distance defined using
       channel correlations or a Euclidean metric. Exploratory data analysis reveals that
       the topological structure defined in terms of a Euclidean metric can distinguish
       between a coherent oscillatory brain state and the desynchronized awake state,
       by associating different Betti numbers to the different brain states.

       Keywords: Topological data analysis · mu rhythm · alpha rhythm ·
       Rat brain · Persistent homology · Betti numbers · Local field potentials ·
       Spike-and-wave

1 Introduction

Multi-channel neurophysiological recordings from the brain produce rich high-
dimensional time series data from which neuroscientists attempt to distinguish different
functional states and relate them to an animal or a person’s behavioral capacities on the
one hand and to underlying neural mechanisms on the other. Our goal is to explore
whether topological data analysis, a new technique that has in recent years proved to be
extremely fruitful in many fields, including in neuroscience (see [2] for a compilation
of references), can reveal higher geometric structure in multichannel neural “local field
potential” (LFP) voltage data and ultimately reveal information about functional states
of the brain, or patterns of functional connectivity, that traditional methods cannot see.
LFPs are analogous to electroencephalographic (EEG) recordings from the scalp, in
that they reflect the electrical activities of many neurons acting in concert, but they are
“depth EEGs” recorded using electrode arrays surgically implanted into selected brain
areas to better discern the sources of neurologically important “brain waves”.
    In this paper we focus on a test case comparing the topological structure of two
known distinct states of the awake rat brain as measured by multisite LFP recordings.
One is a state which can appear in immobile but awake rats, in which the LFP at
multiple cortical and subcortical sites in the rat brain oscillates in a coherent
high-amplitude rhythm with a frequency around 10 Hz [4, 9, 16]. This state has been
referred to as “high voltage spike and wave discharges” [11–13, 15] or informally as
“mu rhythm” by analogy with a human brain rhythm in the same frequency range. For
brevity in this study we will refer to this brain state as mu. We will compare episodes of
this brain state to episodes of non-mu in which the brain is relatively “desynchronized”,
such that LFP fluctuations are smaller in amplitude and more broadband. Aside from
being readily distinguishable in the LFP, these brain states have been shown to
correspond to distinct modes of sensory processing [10].

© Springer Nature Switzerland AG 2019
S. Avdaković (Ed.): IAT 2018, LNNS 59, pp. 3–12, 2019.
https://doi.org/10.1007/978-3-030-02574-8_1

    The goal in this work is to apply topological analysis to the mu and non-mu data in
the hope that it can distinguish these states. This would support the possibility that
topology might detect more subtle patterns that relate LFPs to behavioral and cognitive
states.
    Topology studies intrinsic geometric properties of objects, namely properties of the
shape that remain unchanged after a continuous deformation. The most effective way of
measuring and comparing such properties is to look at topological invariants of the space.
A topological invariant is a mathematical object, such as a polynomial or a group, that
remains unchanged after the space is deformed. Homology groups are one of the most basic
and effective classes of invariants. We will not define them precisely here since this is not
needed for our purposes, but will say something about them in Sect. 2. For a precise
definition, see [8] or [5]. Intuitively, homology groups keep track of the holes in a
topological space. For example, the circle S^1 has a one-dimensional hole, while the sphere
S^2 has a two-dimensional hole. Higher-dimensional topological objects might have
higher-dimensional holes (in fact, the k-dimensional sphere S^k has a k-dimensional hole).
    In topological data analysis, we view data as point clouds endowed with a certain
geometry that in turn gives them the structure of a topological space. The points are
intended to be thought of as finite samples taken from a geometric object, perhaps with
noise. The geometry is provided by a distance function on the data, namely a notion of a
distance between any two data points. The distance is defined using correlations between
signals recorded from different parts of the rat’s brain. From this distance function, one
builds the topological space by means of a Vietoris-Rips complex. Finally, since we now
have a topological space, we can compute its homology groups, thereby learning
something about the shape of the data cloud from the information about its holes.
    The paper is organized as follows: Some mathematical preliminaries, including
basic background on homology and the Vietoris-Rips complex, are provided in Sect. 2.
In Sect. 3 we describe the neurophysiological recording experiments and data set.
Results are presented in Sect. 4 and we summarize our conclusion in Sect. 5.

2 Mathematical Background

Informally, the homology of a topological space X is the family of homology groups

                               H_0(X), H_1(X), H_2(X), \ldots                        (2.1)

    Each of them is a topological invariant that essentially counts the k-dimensional
holes in X. The zeroth homology group, H_0(X), counts the number of connected
components of the topological space, H_1(X) counts the number of 1-dimensional holes,
H_2(X) counts the number of 2-dimensional holes, etc. For example, the homology
groups of the circle S^1 are:

                               H_n(S^1) = Z,   for n = 0, 1;
                               H_n(S^1) = {0}, for n \geq 2.                          (2.2)

    Here Z stands for the group of integers and {0} for the trivial group. More
generally, for a k-dimensional sphere S^k we have:

                            H_n(S^k) = Z,   for n = 0, k;
                            H_n(S^k) = {0}, for all other n.                          (2.3)

   What we mostly care about is the rank, namely the number of copies of Z, of each
homology group, since this number essentially captures all the information about the
group. The rank of the kth homology group is called the kth Betti number, denoted by

                                   b_k = Rank(H_k(X)).                                (2.4)

    Thus b_k counts the number of k-dimensional holes. If b_0(X) = 1, then X consists
of a single connected component; if b_1(X) = 1, then X has a single one-dimensional
hole. A way to capture the number of holes is to see how many loops there are on the
space that cannot be shrunk to a point (counting loops that can be deformed into one
another as the same). An example that illustrates this is the torus T^2 = S^1 × S^1, the
Cartesian product of two circles (a hollow doughnut). It has one connected component,
and so the 0th Betti number is 1; it has two one-dimensional holes because there are two
essential loops (as shown in pink and red in the left panel of Fig. 2) that cannot be shrunk
to points on the torus, so the first Betti number is 2; and the space in the interior of T^2
is a two-dimensional hole, so b_2(T^2) = 1.
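
    The Betti numbers quoted above can be checked on small examples. The following is a minimal sketch, not the method used in this paper: it computes Betti numbers from the ranks of boundary matrices with mod-2 coefficients (an assumption made here to keep the linear algebra short). The function names and the 7-vertex triangulation of the torus are illustrative choices that do not come from the text.

```python
from itertools import combinations


def rank_gf2(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bitmasks."""
    pivots = {}  # leading bit position -> row that owns this pivot
    for row in rows:
        while row:
            high = row.bit_length() - 1
            if high not in pivots:
                pivots[high] = row
                break
            row ^= pivots[high]  # cancel the leading bit and keep reducing
    return len(pivots)


def betti_numbers(top_simplices, max_dim):
    """Mod-2 Betti numbers b_0..b_max_dim of the complex generated by the
    given simplices (each simplex is a tuple of vertex labels)."""
    # Close the list under taking faces, so every face is present.
    simplices = set()
    for s in top_simplices:
        s = tuple(sorted(s))
        for size in range(1, len(s) + 1):
            simplices.update(combinations(s, size))
    by_dim = [sorted(x for x in simplices if len(x) == d + 1)
              for d in range(max_dim + 2)]

    # ranks[k] is the rank of the boundary map from k-simplices
    # to (k-1)-simplices; the map out of the vertices has rank 0.
    ranks = [0]
    for k in range(1, max_dim + 2):
        index = {face: i for i, face in enumerate(by_dim[k - 1])}
        rows = []
        for s in by_dim[k]:
            mask = 0
            for face in combinations(s, k):
                mask |= 1 << index[face]
            rows.append(mask)
        ranks.append(rank_gf2(rows))

    # b_k = (number of k-simplices) - rank(d_k) - rank(d_{k+1})
    return [len(by_dim[k]) - ranks[k] - ranks[k + 1]
            for k in range(max_dim + 1)]


# A hollow triangle is a combinatorial circle S^1.
circle = [(0, 1), (1, 2), (0, 2)]

# A 7-vertex triangulation of the torus T^2 (vertex labels taken mod 7).
torus = []
for i in range(7):
    torus.append((i, (i + 1) % 7, (i + 3) % 7))
    torus.append((i, (i + 2) % 7, (i + 3) % 7))

print(betti_numbers(circle, 1))  # [1, 1]
print(betti_numbers(torus, 2))   # [1, 2, 1]
```

    The two results agree with (2.2) for the circle and with the values b_0 = 1, b_1 = 2, b_2 = 1 stated for the torus.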
    In order to make a topological space out of a data set, one first defines a notion of
distance on it. Namely, to any two points x_i and x_j in the data set, we associate a
non-negative number d(x_i, x_j) satisfying the usual properties of a distance function, i.e. of a
metric. Then one endows the data set with the structure of a Vietoris-Rips complex, the
standard way to make a topological space out of the metric in the context of topological
data analysis. Briefly, the Vietoris-Rips complex of a data cloud X, attached to the
parameter e > 0, and denoted by VR(X, e), is the simplicial complex (a space built out of
triangles, tetrahedra, and their generalizations) whose vertex set is X and where
{x_0, x_1, …, x_k} spans a k-simplex if and only if d(x_i, x_j) ≤ e for all 0 ≤ i, j ≤ k. For an
overview of the Vietoris-Rips complex and the idea of topological data analysis in
general, see [1] or [3]. Figure 1 illustrates the Vietoris-Rips complex of a simple data
cloud for various values of e.

Fig. 1. Example of Vietoris-Rips complexes at different e (figure is taken from the Javaplex
documentation). Connected components are constructed so that data points within e of each other
belong to the same component.
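
    The definition of the Vietoris-Rips complex translates almost directly into code. The sketch below is a brute-force illustration for very small point clouds, not the Javaplex implementation. The function name `vietoris_rips`, the parameter `max_dim`, and the unit-square example are assumptions introduced here.

```python
import math
from itertools import combinations


def vietoris_rips(dist, eps, max_dim=2):
    """Simplices of VR(X, eps) up to dimension max_dim.

    dist is a symmetric matrix (list of lists) of pairwise distances.
    A set of points spans a simplex when all pairwise distances are <= eps.
    Each simplex is returned as a tuple of point indices.
    """
    n = len(dist)
    simplices = []
    for size in range(1, max_dim + 2):
        for subset in combinations(range(n), size):
            if all(dist[i][j] <= eps for i, j in combinations(subset, 2)):
                simplices.append(subset)
    return simplices


# Toy data cloud: the four corners of a unit square.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
dist = [[math.dist(p, q) for q in points] for p in points]

for eps in (0.5, 1.0, 1.5):
    complex_ = vietoris_rips(dist, eps)
    counts = [sum(1 for s in complex_ if len(s) == d + 1) for d in range(3)]
    print(eps, counts)  # numbers of vertices, edges, triangles
```

    At eps = 0.5 only the four vertices are present; at eps = 1.0 the four sides appear and form a loop; at eps = 1.5 the diagonals and all triangles are added and the loop is filled in. The brute-force enumeration grows quickly with the number of points, which is why dedicated packages are used in practice.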

    Once the data cloud has been given the structure of a topological space in this way, we
can compute its homology groups H_k(X), k ≥ 0. This can be done algorithmically
through linear algebra using various data analysis packages available online.
    The one used here was Javaplex [14]. Javaplex produces a persistence barcode for
each homology group, with the number of bars that “survive” being the Betti number
for that homology group. Figure 2 gives an example of the persistence barcodes for the
torus. The interpretation is that the long bars are holes in the data cloud that appear for
various values of e, i.e. they are persistent, and this means that those holes are essential
to the data cloud.

Fig. 2. A torus T^2 with b_0(T^2) = 1, b_1(T^2) = 2 and b_2(T^2) = 1. We see that the barcode plot
shows exactly these Betti numbers. To read the Betti numbers, we count the number of
arrows in the barcode plots associated with each dimension.

    Note that all that is necessary to perform topological data analysis on a data cloud is
the metric, i.e. the distance function; the rest is essentially automatically done by a
computational tool such as Javaplex.
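
    To make the idea of a persistence barcode concrete, the dimension-0 barcode can be computed from the distance function alone. The sketch below is an illustration under stated assumptions, not Javaplex's algorithm: it tracks connected components with a union-find structure as e grows. The names `h0_barcode` and `betti0_at` and the five-point example are hypothetical.

```python
def h0_barcode(dist):
    """Dimension-0 persistence bars (birth, death) for a distance matrix.

    Every point is born as its own component at eps = 0. A component dies
    at the smallest eps at which it merges with another one. One bar never
    dies and is given death = infinity.
    """
    n = len(dist)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted((dist[i][j], i, j)
                   for i in range(n) for j in range(i + 1, n))
    bars = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:           # this edge merges two components
            parent[ri] = rj
            bars.append((0.0, d))
    bars.append((0.0, float("inf")))
    return bars


def betti0_at(bars, eps):
    """Number of bars alive at scale eps, i.e. b_0 of VR(X, eps)."""
    return sum(1 for birth, death in bars if birth <= eps < death)


# Toy data cloud on a line: two well separated clusters.
xs = [0.0, 1.0, 2.0, 10.0, 11.0]
dist = [[abs(a - b) for b in xs] for a in xs]

bars = h0_barcode(dist)
print(bars)
print([betti0_at(bars, eps) for eps in (0.5, 2.0, 9.0)])  # [5, 2, 1]
```

    The bar that dies at e = 8 is much longer than the bars that die at e = 1, which is the sense in which the two clusters are a persistent feature of this toy data cloud.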

3 Materials and Methods

Local field potentials (LFPs) were recorded at 16 parietal and 16 frontal sites in the
cortex of a male Long-Evans rat while the rat passively listened to 100 ms duration
tones of two different pitches, presented with equal probability in random order. The
sample rate was 1000 Hz. Trials were defined as segments of LFP from 0.5 s before
each tone until 1.5 s after the tone. For the present study, to avoid confounds due to the
two pitches, we only analyzed trials in which the lower pitched tone (1500 Hz) was
presented to the rat. We first rejected artifact trials automatically using a 1.5 mV
threshold.
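
    The trial definition and the artifact rejection step can be sketched as follows. This is only an illustration: the text does not say how the 1.5 mV threshold was applied, so the sketch assumes a trial is rejected when the absolute voltage on any channel exceeds the threshold. The function names and the synthetic signal are hypothetical.

```python
import math

FS = 1000              # sample rate in Hz (from the text)
PRE, POST = 0.5, 1.5   # seconds before and after tone onset (from the text)
THRESHOLD_MV = 1.5     # artifact threshold in mV (from the text)


def cut_trials(lfp, tone_samples):
    """Cut trials around tone onsets.

    lfp: list of channels, each a list of voltages in mV.
    tone_samples: sample indices of the tone onsets.
    Returns a list of trials; each trial is a list of channels with
    PRE*FS + POST*FS + 1 = 2001 samples.
    """
    pre, post = int(PRE * FS), int(POST * FS)
    n = len(lfp[0])
    trials = []
    for t0 in tone_samples:
        if t0 - pre < 0 or t0 + post >= n:
            continue  # window would run off the recording
        trials.append([ch[t0 - pre:t0 + post + 1] for ch in lfp])
    return trials


def is_artifact(trial, threshold=THRESHOLD_MV):
    """Assumed rule: reject if any sample on any channel exceeds the threshold."""
    return any(abs(v) > threshold for ch in trial for v in ch)


# Synthetic two-channel recording with one large spike (made-up values).
n_samples = 6000
lfp = [[0.3 * math.sin(2 * math.pi * 10 * t / FS) for t in range(n_samples)]
       for _ in range(2)]
lfp[0][3100] = 3.0  # artifact inside the second trial window

trials = cut_trials(lfp, tone_samples=[1000, 3000])
clean = [tr for tr in trials if not is_artifact(tr)]
print(len(trials), len(trials[0][0]), len(clean))
```

    With these conventions each trial has 2001 time points, matching the trial length used in the rest of the paper.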
    During the passive recording session the rat spontaneously went in and out of the
synchronized ~10 Hz oscillatory state we are referring to as a mu-rhythm. Our goal is
to compare the topology of mu and non-mu trials to see whether it can capture the
difference in brain states. To identify mu and non-mu trials for the purposes of this
comparison, one of us (MCW) with experience studying this brain state selected 126
mu trials and 136 non-mu trials based on visual inspection of one frontal LFP channel.
The selected mu trials exhibited characteristic “spike-and-wave” patterns for the whole
2-second trial. Conversely, the trials selected as representative non-mu trials were free
of the spike-and-wave oscillation for the entire trial. This procedure resulted in a set of
126 mu trials and 136 non-mu trials. Four examples of LFP recordings in each state are
shown in Fig. 3. We chose these two brain states as a test case for our topological
analysis because they are clearly distinct in the LFP, even to an untrained eye.

Fig. 3. The left column shows four examples of one local field potential (LFP) channel recorded
from frontal cortex of an awake rat during episodes of an oscillatory brain state we refer to as mu.
The right column shows four example trials recorded in the same rat during the same session, but
while the rat’s brain was in a relatively desynchronized state we refer to as non-mu. In every trial
a brief tone stimulus was presented to the rat at 500 ms.

    Thus the total data set comprised a 3-dimensional grid of size

         27 × 126 × 2001 = (number of LFP channels) × (number of trials) × (number of time points)

for the mu trials plus a 27 × 136 × 2001 grid for the non-mu trial data. Further details
about electrode implantation, recording coordinates, preprocessing, and other
experimental procedures may be found in [6]. All procedures
involving rats were approved by the Wellesley College Institutional Animal Care and
Use Committee.
     As a possible way to learn about functional connectivity between various parts of
the brain, we analyze the data using persistent brain network homology. For each
trial, denote the data set as a string C = (c_1, c_2, …, c_n) consisting of n nodes, where n is
the number of channels and each c_i is a 2001-dimensional vector whose coordinates are the
LFPs at each ms of a 2.0 s trial. Inspired by an earlier paper [7], we calculate the
distance matrix D based on correlation between channels, defined as
        D_{ij} = \sqrt{1 - \mathrm{corr}(c_i, c_j)}                                              (3.1)

where

        \bar{c}_i = \frac{1}{2001} \sum_{t=1}^{2001} c_{i,t}

and

        \mathrm{corr}(c_i, c_j) = \frac{\sum_{t=1}^{2001} (c_{i,t} - \bar{c}_i)(c_{j,t} - \bar{c}_j)}
                                       {\sqrt{\sum_{t=1}^{2001} (c_{i,t} - \bar{c}_i)^2 \, \sum_{t=1}^{2001} (c_{j,t} - \bar{c}_j)^2}}      (3.2)

is the sample correlation between signals from the ith and jth channels. The correlation,
which is a number between −1 and 1, captures the linear relationship between the
channels. If the correlation is close to 1, this would indicate the two channels are
positively linearly related and “functionally connected”. Figure 4 gives an example of a
distance matrix for a sample trial.
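
    Equations (3.1) and (3.2) can be written out in a few lines. The following is a minimal sketch for one trial stored as a list of channels. The function names and the toy signals are illustrative, and the clamp at zero is an added guard against floating-point rounding rather than part of the definition.

```python
import math


def correlation(x, y):
    """Sample correlation of two equally long signals, as in (3.2)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x)
                    * sum((b - my) ** 2 for b in y))
    return num / den  # undefined (division by zero) for a constant channel


def correlation_distance_matrix(channels):
    """D[i][j] = sqrt(1 - corr(c_i, c_j)), as in (3.1)."""
    n = len(channels)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r = correlation(channels[i], channels[j])
            # Rounding can push 1 - r slightly below zero when r is close to 1.
            D[i][j] = D[j][i] = math.sqrt(max(0.0, 1.0 - r))
    return D


# Toy trial with three channels of 2001 samples each (made-up signals).
t = [k / 1000.0 for k in range(2001)]
c1 = [math.sin(2 * math.pi * 10 * s) for s in t]
c2 = [2.0 * v + 1.0 for v in c1]   # perfectly correlated with c1
c3 = [-v for v in c1]              # perfectly anti-correlated with c1

D = correlation_distance_matrix([c1, c2, c3])
for row in D:
    print([round(v, 3) for v in row])
```

    Perfectly correlated channels end up at distance close to 0, while perfectly anti-correlated channels are at the maximal distance sqrt(2), so the distance is small exactly when two channels are positively linearly related.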
     With the metric now defined, we can associate the topological space VR(C,e) to our
data, and then compute its homology using Javaplex.
     In addition to the metric described above, we also implemented the naive Euclidean
metric, treating each trial as 2001 points collected from a 27-dimensional space,
endowed with the standard Euclidean distance.
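
    For the Euclidean metric the roles of channels and time points are swapped: each time point becomes one point of the cloud. A minimal sketch of that reshaping is given below. The function names and the tiny example are hypothetical, and the text does not say whether the 2001 points were subsampled before the homology computation.

```python
import math


def trial_to_point_cloud(trial):
    """Turn a trial (list of channels, each with T samples) into T points,
    each with one coordinate per channel."""
    return [tuple(column) for column in zip(*trial)]


def euclidean_distance_matrix(points):
    """Pairwise standard Euclidean distances between the points."""
    return [[math.dist(p, q) for q in points] for p in points]


# Toy trial: 3 channels and 4 time points (a real trial would be 27 x 2001).
trial = [
    [0.0, 1.0, 0.0, -1.0],
    [0.0, 0.0, 2.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
]
cloud = trial_to_point_cloud(trial)
dist = euclidean_distance_matrix(cloud)
print(len(cloud), len(cloud[0]))  # 4 3
print(dist[0][1])                 # 1.0
```

    For a full trial the matrix has 2001 × 2001 entries, so this pure-Python version is meant only to show the construction.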

Fig. 4. Left panel: The distance matrix D for trial No. 4 - a single trial in a session where the rat
sat passively while listening to 2 different beeps played in random order. Channels 1–15 are
frontal channels, and channels 16–30 are parietal channels. Right panel: Signals from two
channels in trial No. 4, a frontal channel #2 and a parietal channel #7. The horizontal axis shows
time in milliseconds and the vertical axis shows the LFP voltage in millivolts.

4 Results

In order to test the potential of topological data analysis for understanding multi-
channel LFP neural data, we compared the topology of mu trials, exhibiting a high-
amplitude rhythmic 10 Hz oscillation, to the topology of relatively desynchronized
non-mu trials. Examples of the two LFP states are shown in Fig. 3.
     We take Trial 4, whose distance matrix and channels #2 and #7 are shown in Fig. 4,
as an example to illustrate our correlation-based topological analysis. We obtained
β0 = 2 and β1 = 1 as the only nontrivial Betti numbers for this trial. Topologically, this
means that the data has two connected components and that one of the components has
a 1-dimensional hole, or an essential circle that cannot be shrunk within the data cloud.
Because the distance we defined arises from channel correlations, we believe the two
connected components correspond to the two brain areas: the frontal and the parietal
area.
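
As a rough illustration of what the zeroth Betti number measures, the sketch below counts the connected components of the graph that joins two points whenever their distance is at most a chosen scale ε; this count equals the zeroth Betti number of the Vietoris-Rips complex at that scale. It is not the Javaplex computation used in the paper, and the function name `betti0` and the 4-point distance matrix are hypothetical.

```python
def betti0(dist, eps):
    """Number of connected components of the graph joining i and j
    whenever dist[i][j] <= eps (the 1-skeleton of VR at scale eps)."""
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= eps:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[ri] = rj
    return len({find(i) for i in range(n)})

# Hypothetical example: two tight pairs of points that are far apart
toy = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.2],
    [0.8, 0.9, 0.2, 0.0],
]
```

At a very small scale every point is its own component, at an intermediate scale the two pairs appear as two components, and at a large scale everything merges into one.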
     We first used the correlation distance to analyze all 262 trials and examined the
resulting β0 values from the two groups. Unfortunately, this metric turned out not to be
revealing for distinguishing between mu and non-mu trials. We ran a Wilcoxon rank-sum
test on the β0 values from the 262 trials to test the hypothesis that the two populations
have the same distribution. This nonparametric test gives a p-value of 0.0025, which
means we can reject the null hypothesis at the 5% significance level. We also ran a
Student t-test (dof = 261) comparing the mean β0 in each group. It returned a p-value
of 0.001, indicating that the means are significantly different.
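
For readers unfamiliar with the rank-sum test, the following is a minimal sketch using the normal approximation, with average ranks for ties and no tie correction in the variance. It is not the implementation used by the authors, and the input values are toy numbers chosen for illustration, not Betti numbers from the study.

```python
import math

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).
    Returns the z statistic for the rank sum of x and the p-value."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    w = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p

# Toy values (hypothetical), fully separated groups
z, p = rank_sum_test([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
```

A small p-value indicates that the ranks of one group are systematically lower or higher than those of the other.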
     Although these differences are statistically significant owing to the large number of
trials, they are subtle. For example, Fig. 5 shows that knowing a trial’s β0
would not be sufficient to reliably predict whether it was a mu or non-mu trial. The
distance based on correlations reduces the size of the data from 27 × 2001 to a 27 × 27
distance matrix. That is, the distance summarizes all the information from richly
structured time series data into pairwise correlations, and this is possibly one
reason why we observed only low-dimensional topological structure in the resulting
Vietoris-Rips complex. This compression of the LFP information appears to
obscure the potential topological insight, which is why we also tried the
Euclidean metric.

Fig. 5. Histogram of β0 based on the correlation metric defined in [7]. The red bars show the
mu trials, which have a mean β0 of 2.60 with standard deviation 0.84. The
green bars show the non-mu trials, which have a mean β0 of 2.24 with standard deviation
0.92. The Wilcoxon rank-sum test has a p-value of 0.0025, and the Student t-test comparing
the two means has a p-value of 0.001.

     With the Euclidean metric, trials in both the mu and non-mu groups show larger β0,
which corresponds to the number of connected components in the data cloud representing a
trial. The histogram of these β0 values is shown in Fig. 6. The mu group has an average β0
of 8.40 and standard deviation 2.32. The non-mu group has an average β0 of 19.71 and
standard deviation 5.67. The Wilcoxon rank-sum test has a p-value of 9.5 × 10^−36,
which suggests that the two populations have different distributions and that the Euclidean
metric can indeed be used to detect differences in topological structure between mu
and non-mu trials. The Student t-test has a p-value of 1.5 × 10^−51, so we can clearly reject
the null hypothesis of equal means. Our findings suggest that the data from the mu
trials “clusters” more, in the sense that it forms fewer separate connected components.
     We also calculated β1 for each trial, which is the number of essential holes in the
point-cloud data. Unfortunately, this is not as illuminating as the β0 data for detecting
mu trials: 19 out of 126 mu trials have β1 equal to 1 and, for the non-mu trials, 2 out of
136 have β1 = 1 and one has β1 = 2.

Fig. 6. Histogram of β0 based on Euclidean distance. The red bars show the mu trials, with
mean 8.40 and standard deviation 2.33. The green bars show the non-mu trials, with mean 19.71
and standard deviation 5.67. The Wilcoxon rank-sum test comparing the two groups has a
p-value of 9.5 × 10^−36, which means we can safely reject the hypothesis that the two populations
are from the same distribution. The Student t-test comparing the means returns a p-value of
1.5 × 10^−51.

5 Conclusion

In order to test whether a topological analysis can capture differences between distinct
brain states as measured by LFPs in awake rats, we compared Betti numbers for
segments of multichannel LFP data recorded during an oscillatory “mu” state and a
relatively desynchronized “non-mu” state. A Euclidean-based analysis found Betti-zero
numbers in the mu state to be less than half their values in the non-mu state (Fig. 6),
reflecting greater clustering of the data cloud in the mu state and supporting the view that
topological analysis can detect functional states of the brain in multichannel LFP data.
    In the future, we would like to apply topological data analysis to more sessions and
explore other metrics for defining simplicial complexes. It will be interesting to see
whether the Betti numbers can capture more subtle functional differences in brain state
than those we examined in this study, and whether higher-order Betti numbers can also
be useful for distinguishing functional brain states.

Acknowledgments. The authors would like to thank the Wellesley College Science Center
Summer Research Program and the Brachman-Hoffman Fellowship. Ismar Volić would also like
to thank the Simons Foundation for its support. Michael Wiest’s work was supported by National
Science Foundation Integrative Organismal Systems grants 1121689 and 1353571.

References
 1. Carlsson, G.: Topology and data. Bull. Am. Math. Soc. 46, 255–308 (2009)
 2. Curto, C.: What can topology tell us about the neural code? Bull. Am. Math. Soc. 54, 63–78
    (2016)
 3. Edelsbrunner, H., Harer, J.: Computational Topology: An Introduction. American Mathe-
    matical Society, Providence (2009)
 4. Fontanini, A., Katz, D.B.: 7 to 12 Hz activity in rat gustatory cortex reflects disengagement
    from a fluid self-administration task. J. Neurophysiol. 93, 2832–2840 (2005)
 5. Hatcher, A.: Algebraic Topology. Cambridge University Press, Cambridge (2001)
 6. Imada, A., Morris, A., Wiest, M.: Deviance detection by a P3-like response in rat posterior
    parietal cortex. Front. Integr. Neurosci. 6, 127 (2013)
 7. Khalid, A., Kim, B.S., Chung, M.K., Ye, J.C., Jeon, D.: Tracing the evolution of multi-scale
    functional networks in a mouse model of depression using persistent brain network
    homology. NeuroImage. 101, 351–363 (2014)
 8. Munkres, J.: Topology, 2nd edn. Pearson, London (2000)
 9. Nicolelis, M.A., Baccala, L.A., Lin, R.C., Chapin, J.K.: Sensorimotor encoding by
    synchronous neural ensemble activity at multiple levels of the somatosensory system.
    Science 268, 1353–1358 (1995)
10. Nicolelis, M.A., Fanselow, E.E.: Thalamocortical [correction of Thalamcortical] optimiza-
    tion of tactile processing according to behavioral state. Nat. Neurosci. 5, 517–523 (2002)
11. Polack, P.O., Charpier, S.: Intracellular activity of cortical and thalamic neurons during high-
    voltage rhythmic spike discharge in Long-Evans rats in vivo. J. Physiol. 571, 461–476
    (2006)
12. Rodgers, K.M., Dudek, F.E., Barth, D.S.: Progressive, seizure-like, spike-wave discharges
    are common in both injured and uninjured sprague-dawley rats: implications for the fluid
    percussion injury model of post-traumatic epilepsy. J. Neurosci. 35, 9194–9204 (2015)
13. Shaw, F.Z.: 7–12 Hz high-voltage rhythmic spike discharges in rats evaluated by
    antiepileptic drugs and flicker stimulation. J. Neurophysiol. 97, 238–247 (2007)
14. Tausz, A., Vejdemo-Johansson, M., Adams, H.: JavaPlex: a research software package for
    persistent (Co)homology. Software (2011). http://code.google.com/javaplex
15. Vergnes, M., Marescaux, C., Depaulis, A., Micheletti, G., Warter, J.M.: Spontaneous spike
    and wave discharges in thalamus and cortex in a rat model of genetic petit mal-like seizures.
    Exp. Neurol. 96, 127–136 (1987)
16. Wiest, M.C., Nicolelis, M.A.: Behavioral detection of tactile stimuli during 7–12 Hz cortical
    oscillations in awake rats. Nat. Neurosci. 6, 913–914 (2003)
Benford’s Law and Sum Invariance Testing

                                     Zoran Jasak(&)

                    NLB Banka d.d., Sarajevo, Bosnia and Herzegovina
                              zoran.jasak@nlb.ba

       Abstract. Benford’s law is a logarithmic law for the distribution of leading digits,
       given by P[D = d] = log10(1 + 1/d), where d is the leading digit or group of
       leading digits. It is named after Frank Albert Benford (1938), who formulated a
       mathematical model of this probability. Before him, the same observation was made
       by Simon Newcomb. This law changed the usual presumption that each digit is
       equally probable at each position in a number. One of the main characteristic
       properties of this law is sum invariance: the sums of significands are
       the same for any leading digit or group of digits. The term ‘significand’ is used
       instead of the term ‘mantissa’ to avoid terminological confusion with the logarithmic
       mantissa.

1 Introduction

In the article “Note on the frequency of use of different digits in natural numbers” (Am J
Math 4(1):39–40, 1881), Simon Newcomb asserted: “That the ten digits do not occur with
equal frequency must be evident to any one making much use of logarithmic tables, and
noticing how much faster the first pages wear out than the last ones. The first significant
figure is oftener 1 than any other digit, and the frequency diminishes up to 9.”
Newcomb did not give a mathematical explanation of this observation, only relative
frequencies, which were verified later [1].
    The same phenomenon was rediscovered by Benford (1938) [2], who gave the
mathematical formulation:

\[ P[D = d] = \log_{10}\left(1 + \frac{1}{d}\right) \qquad (1) \]

    This law is presented in Fig. 1.
    He called this phenomenon the “Law of Anomalous Numbers” because, as he asserted,
“…An analysis of the numbers from different sources shows that the numbers
taken from unrelated subjects, such as a group of newspaper items, show a much better
agreement with a logarithmic distribution than do numbers from mathematical tabulations
or other formal data. There is here the peculiar fact that numbers that individually
are without relationship are, when considered in large groups, in good
agreement with a distribution law”.
    For a long time this was treated merely as a curiosity. The law remains a challenge
from many theoretical and practical standpoints and is considered an unsolved problem [3].

© Springer Nature Switzerland AG 2019
S. Avdaković (Ed.): IAT 2018, LNNS 59, pp. 13–21, 2019.
https://doi.org/10.1007/978-3-030-02574-8_2

                     Fig. 1. Probabilities of leading digits for base 10

    It is difficult to find an area in which this law cannot be applied. One of its most
frequent uses is fraud detection. The basic premise is that it is difficult to fabricate
numbers that follow Benford’s law as closely as numbers from ordinary, unmanipulated
processes do.
    The exponential form of a real number x in base b is:

\[ x = S(x) \cdot b^m, \quad m \in \mathbb{Z} \qquad (2) \]

    The original word used to describe the coefficient S(x) of floating-point numbers is
mantissa. This usage remains common in computing and among computer scientists.
However, this use of the word mantissa is discouraged by the IEEE floating point
standard committee and by some professionals such as W. Kahan and D. Knuth
because it conflicts with the pre-existing usage of mantissa for the fractional part of a
logarithm. The newer term is significand.
    A formal definition of the significand is given by Berger and Hill [4].
Definition. The (decimal) significand function S : ℝ → [1, 10) is defined as follows: if
x ≠ 0 then S(x) = t, where t is the unique number in [1, 10) with |x| = 10^k · t for some
(necessarily unique) k ∈ ℤ; if x = 0 then S(x) = 0.
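
The definition translates directly into code. The following is a minimal sketch in Python; the function names `significand` and `leading_digit` are chosen here for illustration, and the small correction step is only a guard against floating-point rounding near powers of ten.

```python
import math

def significand(x):
    """Decimal significand S(x) in [1, 10); S(0) = 0 by convention."""
    if x == 0:
        return 0.0
    ax = abs(x)
    k = math.floor(math.log10(ax))  # exponent k with |x| = 10**k * t
    t = ax / 10 ** k
    # Guard against rounding that pushes t just outside [1, 10)
    if t >= 10:
        t /= 10
    elif t < 1:
        t *= 10
    return t

def leading_digit(x):
    """First significant digit of x (0 only for x = 0)."""
    return int(significand(x))
```

For example, `significand(176932.50)` is approximately 1.769325 and `leading_digit(-0.00345)` is 3, since the sign and the position of the decimal point do not affect the significand.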

2 Invariances

Among the most interesting properties of Benford’s law are base, scale and sum
invariance.

    Base invariance means that the probabilities of leading digits follow the logarithmic law
in any base b ≥ 2. The mathematical formulation of this property is:

\[ P[D_1 = d \mid b] = \log_b\left(1 + \frac{1}{d}\right) = \frac{\log\left(1 + \frac{1}{d}\right)}{\log b} \qquad (3) \]

   In Fig. 2 the theoretical probabilities for bases 2 to 10 are presented.

                   Fig. 2. Probabilities of leading digits for bases 2 to 10
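
The leading-digit probabilities for any base can be tabulated with a short Python sketch. The function name `benford_probs` is hypothetical; the formula is the base-b logarithmic law, with d ranging over the digits 1, …, b − 1.

```python
import math

def benford_probs(base=10):
    """P[D1 = d | b] = log_b(1 + 1/d) for d = 1, ..., b - 1."""
    return {d: math.log(1 + 1 / d) / math.log(base) for d in range(1, base)}

p10 = benford_probs(10)  # decimal digits 1..9
p2 = benford_probs(2)    # in base 2 the leading digit is always 1
```

In every base the probabilities sum to 1, because the sum of log(1 + 1/d) over d = 1, …, b − 1 telescopes to log b.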

     Hill proved [5] that base invariance implies Benford’s law.
     Scale invariance means that the probabilities of leading digits remain the same if the
whole sample is multiplied by a positive number a ≠ b^k, k ∈ ℤ. This
property has practical importance: it makes it possible to check, for example, whether
numerical values come from an (un)manipulated source.
     Hill [5] proved that scale invariance implies base invariance.
     The idea of sum invariance was presented by Mark Nigrini, who asserted in his Ph.D.
thesis (1992) that tables of unmanipulated accounting data closely follow Benford’s
Law (BL) and that, in a sufficiently long list of data for which BL holds, the sum of all
entries with leading digit d is constant for various d. Nigrini, in his book [6], calculated the
integral of a · r^(x−1) between leading digits ft and ft + 1; the result does not depend on the
leading digits, and he concluded that the sums must be equal.
     An extension of this observation can be stated for k-tuples of leading digits; this is
called the sum invariance property of Benford’s law.
     A formal definition of sum invariance is given by Berger and Hill [4, p. 61].

Definition. A sequence {x_n} of real numbers has sum invariant significant digits if, for
every m ∈ ℕ, the limit

\[ \lim_{N \to \infty} \frac{\sum_{n=1}^{N} S_{d_1,\ldots,d_m}(x_n)}{N} \]

exists and is independent of d_1, …, d_m.
    Here S_{d_1,…,d_m}(x_n) is the significand of x_n if its leading digits are d_1, …, d_m, and 0 otherwise.
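
Sum invariance for the first digit can be checked numerically. The sketch below is only an illustration under stated assumptions: it simulates significands as 10^U with U uniform on [0, 1), which follow Benford’s law, uses a fixed seed and an arbitrary sample size, and the function name `digit_sums` is hypothetical. Under Benford’s law the expected contribution of each leading digit to the normalized sum is log10(e) ≈ 0.4343.

```python
import math
import random

def digit_sums(significands):
    """Sum of significands grouped by first digit, divided by sample size."""
    n = len(significands)
    sums = {d: 0.0 for d in range(1, 10)}
    for s in significands:
        d = min(int(s), 9)  # first digit; the min() guards against rounding
        sums[d] += s
    return {d: total / n for d, total in sums.items()}

rng = random.Random(0)  # fixed seed so the run is reproducible
# 10**U with U uniform on [0, 1) has Benford-distributed significands
sample = [10 ** rng.random() for _ in range(200_000)]
normalized = digit_sums(sample)
```

Although digit 1 occurs far more often than digit 9, the nine normalized sums come out nearly equal, because the larger significands of the rarer digits compensate for their lower frequency.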
    Analytical tools relying on Benford’s law are primarily oriented toward the analysis of
frequencies. Sum invariance is interesting from a practical point of view because it can
be a very efficient additional tool in all such analyses.
    Some facts are important [7]:
– The significands of the numbers in a table, not the numbers themselves, must be added.
  Otherwise, a single astronomically large number in a table would dominate all other
  sums;
– the word ‘constant’ in Nigrini’s statement should be read as ‘constant in
  expectation’.
     Pieter C. Allart proved a theorem formalizing this empirical observation.
Theorem. A probability measure P on (R+, B) is sum-invariant if and only if its
corresponding significand distribution PS is Benford’s law.

3 Results and Discussion

The idea of my research is to investigate sum invariance not only for leading digits but for
any k-tuple of consecutive digits inside the number, and to propose a testing method.
    The sum invariance property can be extended to the second, third, … digit. In other
words, in a sample that follows Benford’s law, the sums of significands having the same digits
(or groups of digits) at the same positions are equal. The property is not limited to
leading digits. Another interpretation is that the sum of significands for first digit d is
1/9 of the total sum of significands in the sample.
    The null hypothesis is H0: the sums of significands for groups of consecutive digits are the
same.
    The main problem in testing is to estimate the expected sums of significands.
    For a sample of N elements, the theoretical frequency of every digit d is
given by

\[ n_d = N \cdot \log_{10}\left(1 + \frac{1}{d}\right) \qquad (4) \]

   Sum invariance means that there is a number T_d, the sum of the significands
beginning with digit d:

\[ T_d = \sum_{i=1}^{n_d} S_i(d) \]

   Here S_i(d) is the significand of the i-th numerical value having d as its leading digit.
Dividing this relation by n_d gives the average significand (arithmetic mean) for the
group of significands, denoted by S̄(d). This is an analogue of the actual mean defined by
Dumas and Devine [8, p. 16]:

\[ AM = \frac{1}{N} \sum X_{\mathrm{collapsed}} \]

where X_collapsed is defined by

\[ X_{\mathrm{collapsed}} = \frac{10 \cdot X}{10^{\mathrm{int}(\log_{10} X)}} \]

    With an accuracy of five digits, the smallest and the largest average significands for
numbers beginning with digit 9 are 9.00000 and 9.99999, respectively. From this it is
possible, by recurrence, to obtain the smallest and largest significands for the other leading
digits, denoted by S_min and S_max in Table 1. The same calculation can be conducted for groups
of leading digits of any size.

   Table 1. Theoretical minimal, average and maximal significands for one leading digit
                           Digit    S_Min     Average         S_Max
                           1        1.36803   1.44270         1.52003
                           2        2.33866   2.46630         2.59891
                           3        3.29615   3.47606         3.66239
                           4        4.24948   4.48142         4.72164
                           5        5.20095   5.48481         5.77882
                           6        6.15141   6.48716         6.83490
                           7        7.10129   7.48888         7.89031
                           8        8.05077   8.49019         8.94530
                           9        9.00000   9.49122         9.99999

    Sum invariance is based on an interesting property of the logarithmic curve [10]. If the
interval [1, 10) is divided into subintervals of equal size, the areas of the curvilinear rectangles
bounded by the lines

\[ y = \log_{10} x, \quad x = 0, \quad l_3 = \log_{10}\left(1 + \frac{1}{d}\right), \quad l_4 = \log_{10}\left(1 + \frac{1}{d+1}\right) \]

are equal. The lines l_3 and l_4 are Benford’s probabilities, and d are the digits 1, 2, …, 9. The next
theorem is very important [10].

Theorem. A probabilistic measure P for Benford’s law is sum invariant if and only if
[B^(k−1), B^k) is divided into n subintervals of equal size.
    The digits 1 to 9 give one way in which the interval [1, 10) can be divided into subintervals
of equal size. The same can be done with any other interval [B^(k−1), B^k), where B is the base.
    A natural idea for sum invariance is to use average significands. They can be easily
calculated by [10]:

\[ \bar{x} = \frac{\log_{10} e}{\log_{10}\left(1 + \frac{1}{d}\right)} \]

    where d is a digit 1, 2, …, 9. This formula is obtained by applying the mean value theorem
to the logarithmic curve on the intervals [d, d + 1). The average significands for leading digits
are in Table 1. It can be easily verified that the average significands are harmonic averages of
the minimal and maximal significands.
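
The Average column of Table 1 can be reproduced from this formula. The following short Python sketch uses the hypothetical function name `average_significand` and evaluates log10(e) / log10(1 + 1/d) for each leading digit.

```python
import math

def average_significand(d):
    """Average significand for leading digit d:
    log10(e) / log10(1 + 1/d), which equals 1 / ln(1 + 1/d)."""
    return math.log10(math.e) / math.log10(1 + 1 / d)

# Average significands for leading digits 1..9
averages = {d: average_significand(d) for d in range(1, 10)}
```

For d = 1, 2 and 9 this gives approximately 1.44270, 2.46630 and 9.49122, in agreement with Table 1.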
    The theoretical sum of significands is given by the formula [10]:

\[ T_1 = 9 \cdot N \cdot \left( \sum_{d=1}^{9} \frac{1}{\bar{S}(d)} \right)^{-1} \]

    The main reason for this proposal is that the harmonic mean, rather than the arithmetic
mean, is the appropriate average here. The adequacy of this approach is verified in [10].
    This formula means that the total sum of significands T_1 is the sample size multiplied by
the harmonic mean of the average significands, denoted here by S̄(d); the sum of significands
for one leading digit is then T_1/9.
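
As a worked illustration, the theoretical sum can be evaluated in Python. The sketch assumes that the average significands are given by log10(e) / log10(1 + 1/d) and takes N = 11,486, the sample size reported in Section 4; the function name `expected_total_sum` is hypothetical.

```python
import math

def expected_total_sum(n):
    """T1 = 9 * N * (sum over d = 1..9 of 1 / S_bar(d)) ** -1, assuming
    S_bar(d) = log10(e) / log10(1 + 1/d)."""
    inv_sum = sum(math.log10(1 + 1 / d) / math.log10(math.e)
                  for d in range(1, 10))
    return 9 * n / inv_sum

N = 11_486                # sample size reported in Section 4
T1 = expected_total_sum(N)
per_digit = T1 / 9        # expected sum for a single leading digit
```

Because the reciprocals of the average significands sum to ln 10, the per-digit value reduces to N · log10(e), roughly 4,988 for this sample size.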
    The expected sum of significands having the same leading digits can be found by
multiplying the average significand for this group by the number of such significands. This
formula is for the 9 leading digits, but it can easily be extended to the 90 groups of first two
digits, the 10 second digits, the 100 pairs of digits in the second and third positions, etc. Using
the maximal and minimal average significands from Table 1, we obtain lower and upper limits
for the sums. The same formula is used to calculate the sums of significands, which are needed
for testing purposes.

                       Table 2. Calculation of values from sample
                      Dig   Counts   Sam_Per   Sums          Av_Sig
                      1     4047     0.35234   5577.88735    1.37828
                      2     1747     0.15210   4019.78036    2.30096
                      3     1222     0.10639   4047.61105    3.31228
                      4     997      0.08680   4282.58452    4.29547
                      5     921      0.08018   4765.26876    5.17402
                      6     721      0.06277   4520.78251    6.27016
                      7     623      0.05424   4531.63929    7.27390
                      8     639      0.05563   5326.60358    8.33584
                      9     569      0.04954   5309.84965    9.33190
   The expected sum of significands having the same digit in the second position is 9/10 of
the sum for the first position, namely [10]:

\[ T_2 = \frac{9}{10} \cdot T_1 \]

   In this way we have adequate tools for testing of sum invariance property.

4 Sum Invariance Testing

The main goal for practitioners is to test the sum invariance property. In other words, the task
is to investigate whether there is any discrepancy between the theoretical and sample sums of
significands.
    My proposal is to use f-divergence for testing the sum invariance property. Divergence
measures play an important role in statistical theory, especially in large sample theories
of estimation and testing [9]. The underlying reason is that they are indices of statistical
distance between probability distributions P and Q; the smaller these indices are, the
harder it is to discriminate between P and Q. Many divergence measures have been
proposed since the publication of the paper of Kullback and Leibler [12].
    In order to conduct a unified study of the statistical properties of divergence measures,
Salicru, Morales, Pardo and Menendez [9] proposed a generalized divergence that
includes other divergence measures as particular cases. They proposed a unified
expression, called the (h, φ)-divergence, as follows [9]:

\[ D_{\varphi}^{h}(\theta_1, \theta_2) = \int_{\Lambda} h_{\alpha}\left( \int_{X} f_{\theta_2}(x)\, \varphi_{\alpha}\left( \frac{f_{\theta_1}(x)}{f_{\theta_2}(x)} \right) d\mu(x) - \varphi_{\alpha}(1) \right) d\eta(\alpha) \qquad (5) \]

where h = (h_α)_{α∈Λ}, φ = (φ_α)_{α∈Λ}, φ_α and h_α are real-valued C² functions with h_α(0) = 0,
and η is a σ-finite measure on the measurable space (Λ, β).
    Let X be a random variable denoting the quotient of the sum of significands for
one leading digit and the total sum of significands; we test the hypothesis that X has a
discrete uniform distribution with probabilities 1/9. The test statistic derived from (5) is
[11]:

\[ T_2 = 36 \cdot \left[ 9 \cdot \left( \sum_{i=1}^{9} \hat{p}_i^{\,0.5} \right)^{-2} - 1 \right] \]

    This statistic is used for first leading digits. Here p̂_i denotes the sample quotient
of the sum of significands for one digit and the total sum of significands. Analogous statistics
are derived for the first two digits and for second digits [10].
    This statistic, for n = 9 digits, has a χ² distribution with 8 degrees of freedom, as described
in [9], with appropriate statistical tables.

    An advantage of this procedure is the additivity of the statistic T_2. We can choose the
groups of digits we want to test, for instance if we want to intentionally exclude some digits or
are dealing with a process that produces numbers with specific leading digits. The only
condition is that we need at least two different digits in our sample.
    This method is demonstrated on a sample of 11,486 elements. The minimal
sample value is 10, the maximal value is 176,932.50, the average is 3,606.00, the standard
deviation is 7,793.29, and the total sum of all values is 41,418,526.12. All calculations are
made at the α = 0.05 significance level. Table 2 presents these calculations.
    Column DIG contains the leading digits, column COUNTS the sample frequencies for
every digit, column SAM_PER the sample relative frequencies, column SUMS the
sample sums of significands for every digit, and column AV_SIG the average
significands for every digit. The total sample sum of significands is 42382.00706 for first
digits.
    In Table 3 calculation of test statistic is presented.

                          Table 3. Calculation of test statistic
                        Rat_Th    Rat_Sa    pi*qi     Sqrt(AE)
                        0.11111   0.13161   0.01462   0.12092688
                        0.11111   0.09485   0.01054   0.10265714
                        0.11111   0.09550   0.01061   0.10301189
                        0.11111   0.10105   0.01123   0.10595976
                        0.11111   0.11244   0.01249   0.11177166
                        0.11111   0.10667   0.01185   0.10886663
                        0.11111   0.10692   0.01188   0.10899728
                        0.11111   0.12568   0.01396   0.11817162
                        0.11111   0.12529   0.01392   0.11798563

     Column RAT_TH contains the quotients of the theoretical sums for leading digits and the
total theoretical sum of significands. As expected, all quotients are 1/9. Column RAT_SA
contains the quotients of the sample sums for leading digits and the total sample sum of
significands. Column pi*qi contains the products of the quotients RAT_TH and RAT_SA. The
last column contains the square roots of the products p_i · q_i.
     The value of the statistic T_2 in this case is T_2 = 0.11920. The critical region corresponds
to the probability

\[ P\left[ |T_2| \ge \chi^2_{\alpha/2,\,8} \right] = \alpha \]

    For α = 0.05 the critical region consists of the intervals (0; 2.1797307) and
(17.53454614; +∞). The computed value T_2 = 0.11920 falls in the first interval, so we have no
reason to accept the hypothesis. This means that the sums of significands in the
sample are not equally distributed.
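
The reported value of the statistic can be recomputed from the Sums column of Table 2. The sketch below assumes that the statistic has the form T_2 = 36 · [9 · (Σ √p̂_i)^(−2) − 1], a reading that reproduces the value 0.11920 given in the text; the function name `t2_statistic` is hypothetical, and the critical values are simply those quoted above rather than recomputed.

```python
import math

# Sample sums of significands for leading digits 1..9 (Sums column, Table 2)
sums = [5577.88735, 4019.78036, 4047.61105, 4282.58452, 4765.26876,
        4520.78251, 4531.63929, 5326.60358, 5309.84965]

def t2_statistic(digit_sums):
    """Assumed form: T2 = 36 * (9 * (sum_i sqrt(p_i)) ** -2 - 1),
    where p_i is the share of digit i in the total sum of significands."""
    total = sum(digit_sums)
    root_sum = sum(math.sqrt(s / total) for s in digit_sums)
    return 36 * (9 / root_sum ** 2 - 1)

t2 = t2_statistic(sums)
lower, upper = 2.1797307, 17.53454614  # critical values quoted in the text
in_critical_region = t2 < lower or t2 > upper
```

The statistic is zero when all nine shares are exactly 1/9 and grows as the shares move away from uniformity; for the sample above it evaluates to about 0.1192, which lies in the lower critical interval.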

5 Conclusions

This paper presents a test of the sum invariance property of Benford’s law. My proposal is to
use average significands and an additional method for calculating the expected sums of
significands. Using f-divergence as a test procedure has some important advantages, such as
the additivity property.

Acknowledgments. I wish to thank Mr. Wilhelm Schappacher for his great support of my work.

References
 1. Newcomb, S.: Note on the frequency of use of different digits in natural numbers. Am.
    J. Math. 4, 39–40 (1881)
 2. Benford, F.A.: The law of anomalous numbers. Proc. Am. Philos. Soc. 78, 551–572 (1938)
 3. Strauch, O.: Unsolved problems. Tatra Mt. Math. Publ. 56(3), 175–178 (2013)
 4. Berger, A., Hill, T.P.: Theory of Benford’s Law. Probab. Surv. 8, 1–126 (2011).
    https://doi.org/10.1214/11-ps175. ISSN 1549-5787
 5. Hill, T.P.: Base invariance implies Benford’s Law. Proc. Am. Math. Soc. 123(3), 887–895
    (1995)
 6. Nigrini, M.: Forensic Analytics – Methods and Techniques for Forensic Accounting
    Investigations, pp. 144–146. Wiley, Hoboken (2011)
 7. Allart, P.C.: A Sum-invariant Characterization of Benford’s Law. AMS (1990)
 8. Dumas, C., Devine, J.S.: Detecting evidence of non-compliance in self-reported pollution
    emissions data: an application of Benford’s law. Selected Paper American Agricultural
    Economics Association Annual Meeting Tampa, Fl, 30 July–2 August 2000
 9. Salicru, M., Morales, D., Menendez, M.L., Pardo, L.: On the Application of Divergence
    Type Measures in Testing Statistical Hypotheses. J. Multivar. Anal. 51, 372–391 (1994)
10. Jasak, Z.: Sum invariance testing and some new properties of Benford’s law, Doctoral
    dissertation. University in Tuzla, Bosnia and Herzegovina (2017)
11. Jasak, Z.: Benford’s law and invariances. J. Math. Syst. Sci. 1(1), 1–6 (2011). (Serial No.1).
    ISSN 2159-5291
12. Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Stat. 22(1), 79–86
    (1951)
Using Partial Least Squares Structural
       Equation Modeling to Predict Entrepreneurial
            Capacity in Transition Economies

                                    Matea Zlatković(&)

               Faculty of Economics, University of Banja Luka, Banja Luka,
                                Bosnia and Herzegovina
                         matea.zlatkovic@ef.unibl.org

       Abstract. Many theoretical and empirical studies indicate the significant
       influence of environmental challenges and characteristics on entrepreneur-
       ship. Drawing insights from this research, this paper defines the structural model
       to analyze synergistic influences of certain elements of Entrepreneurial Factor
       Conditions on the entrepreneurial capacity in Slovenia and Bosnia and Herze-
       govina. The analyzed structural model consists of three environmental dimen-
       sions – entrepreneurial education and training, cultural and social norms and
       research and development, and higher-order construct entrepreneurial capacity
       as a final target dependent variable. Partial Least Squares Structural Equation
       Modeling was used to analyze the relationships between the chosen variables. The
       obtained results indicate that cultural and social norms have the highest significance
       for entrepreneurial capacity in both countries. Entrepreneurial education and training
       does not have a direct effect on entrepreneurial capacity in the factor-driven economy
       of Bosnia and Herzegovina, which suggests that education programs are insufficiently
       equipped with the necessary tools for starting and managing a new business.
       Research and development has an important role in entrepreneurial
       capacity in both countries because it yields innovation as a generator of ideas
       for new businesses and technological changes, creating new opportunities for
       entrepreneurial activities.

1 Introduction

Countries at varying degrees of development differ in their overall social, political
and cultural trends, which are reflected in the entrepreneurial behavior of the population as
well as in the scale and structure of entrepreneurial endeavors. The level of economic
development directly influences entrepreneurial conditions and the environment as the
basic preconditions of entrepreneurial behavior. In addition to the personal traits, skills
and motivations of individuals, entrepreneurial behavior depends on the availability of
entrepreneurial capital, government programs and policies, physical infrastructure,
entrepreneurship education, etc.
    The conceptual model of the entrepreneurial environment presented in the Global
Entrepreneurship Monitor (GEM) is in all segments supported by the views of the
classical Austrian economic school. The model encompasses general national condi-
tions affecting business activities such as institutions, macroeconomic stability,

© Springer Nature Switzerland AG 2019
S. Avdaković (Ed.): IAT 2018, LNNS 59, pp. 22–35, 2019.
https://doi.org/10.1007/978-3-030-02574-8_3