tab-dataset.tab_dataset.dataset

The dataset module is part of the tab-dataset package.

It contains the classes Sdataset and Ndataset for Dataset entities.

For more information, see the user guide or the github repository.
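
The example below is a minimal orientation sketch based on the `Sdataset.ext` constructor and the properties documented in the source that follows; the field names and values are illustrative.

    from tab_dataset.dataset import Sdataset

    # build a Dataset from external values: one list of values per Field
    dts = Sdataset.ext([['paris', 'lyon', 'paris'], [75, 69, 75]],
                       ['city', 'code'])

    print(dts.lname)       # list of Field names: ['city', 'code']
    print(dts.lenindex)    # number of Fields: 2
    print(len(dts))        # number of records: 3
    print(dts.consistent)  # True only if all the records are different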

   1# -*- coding: utf-8 -*-
   2"""
   3The `dataset` module is part of the `tab-dataset` package.
   4
   5It contains the classes `Sdataset` and `Ndataset` for Dataset entities.
   6
   7For more information, see the 
   8[user guide](https://loco-philippe.github.io/tab-dataset/docs/user_guide.html) 
   9or the [github repository](https://github.com/loco-philippe/tab-dataset).
  10"""
  11from collections import Counter
  12from copy import copy
  13import math
  14import json
  15import csv
  16
  17
  18from tab_dataset.cfield import Cutil
  19from tab_dataset.dataset_interface import DatasetInterface
  20from tab_dataset.field import Nfield, Sfield
  21from tab_dataset.cdataset import Cdataset, DatasetError
  22
  23FILTER = '$filter'
  24
  25class Sdataset(DatasetInterface, Cdataset):
  26    # %% intro
  27    '''
  28    `Sdataset` is a child class of `Cdataset` whose internal values can differ
  29    from the external values (lists are converted to tuples and dicts to json objects).
  30    
  31    One attribute is added: 'field', which defines the Field class used for the data.
  32
  33    The methods defined in this class are:
  34
  35    *constructor (@classmethod)*
  36
  37    - `Sdataset.from_csv`
  38    - `Sdataset.from_file`
  39    - `Sdataset.merge`
  40    - `Sdataset.ext`
  41    - `Cdataset.ntv`
  42    - `Cdataset.from_ntv`
  43
  44    *dynamic value - module analysis (getters @property)*
  45
  46    - `DatasetAnalysis.analysis`
  47    - `DatasetAnalysis.anafields`
  48    - `Sdataset.extidx`
  49    - `Sdataset.extidxext`
  50    - `DatasetAnalysis.field_partition`
  51    - `Sdataset.idxname`
  52    - `Sdataset.idxlen`
  53    - `Sdataset.iidx`
  54    - `Sdataset.lenidx`
  55    - `Sdataset.lidx`
  56    - `Sdataset.lidxrow`
  57    - `Sdataset.lisvar`
  58    - `Sdataset.lvar`
  59    - `DatasetAnalysis.lvarname`
  60    - `Sdataset.lvarrow`
  61    - `Cdataset.lunicname`
  62    - `Cdataset.lunicrow`
  63    - `DatasetAnalysis.partitions`
  64    - `DatasetAnalysis.primaryname`
  65    - `DatasetAnalysis.relation`
  66    - `DatasetAnalysis.secondaryname`
  67    - `Sdataset.setidx`
  68    - `Sdataset.zip`
  69
  70    *dynamic value (getters @property)*
  71
  72    - `Cdataset.keys`
  73    - `Cdataset.iindex`
  74    - `Cdataset.indexlen`
  75    - `Cdataset.lenindex`
  76    - `Cdataset.lname`
  77    - `Cdataset.tiindex`
  78
  79    *global value (getters @property)*
  80
  81    - `DatasetAnalysis.complete`
  82    - `Sdataset.consistent`
  83    - `DatasetAnalysis.dimension`
  84    - `Sdataset.primary`
  85    - `Sdataset.secondary`
  86
  87    *selecting - infos methods*
  88
  89    - `Sdataset.idxrecord`
  90    - `DatasetAnalysis.indexinfos`
  91    - `DatasetAnalysis.indicator`
  92    - `Sdataset.iscanonorder`
  93    - `Sdataset.isinrecord`
  94    - `Sdataset.keytoval`
  95    - `Sdataset.loc`
  96    - `Cdataset.nindex`
  97    - `Sdataset.record`
  98    - `Sdataset.recidx`
  99    - `Sdataset.recvar`
 100    - `Cdataset.to_analysis`
 101    - `DatasetAnalysis.tree`
 102    - `Sdataset.valtokey`
 103
 104    *add - update methods*
 105
 106    - `Cdataset.add`
 107    - `Sdataset.addindex`
 108    - `Sdataset.append`
 109    - `Cdataset.delindex`
 110    - `Sdataset.delrecord`
 111    - `Sdataset.orindex`
 112    - `Cdataset.renameindex`
 113    - `Cdataset.setname`
 114    - `Sdataset.updateindex`
 115
 116    *structure management - methods*
 117
 118    - `Sdataset.applyfilter`
 119    - `Cdataset.check_relation`
 120    - `Cdataset.check_relationship`
 121    - `Sdataset.coupling`
 122    - `Sdataset.full`
 123    - `Sdataset.getduplicates`
 124    - `Sdataset.mix`
 125    - `Sdataset.merging`
 126    - `Cdataset.reindex`
 127    - `Cdataset.reorder`
 128    - `Sdataset.setfilter`
 129    - `Sdataset.sort`
 130    - `Cdataset.swapindex`
 131    - `Sdataset.setcanonorder`
 132    - `Sdataset.tostdcodec`
 133
 134    *exports methods (`tab_dataset.dataset_interface.DatasetInterface`)*
 135
 136    - `Dataset.json`
 137    - `Dataset.plot`
 138    - `Dataset.to_obj`
 139    - `Dataset.to_csv`
 140    - `Dataset.to_dataframe`
 141    - `Dataset.to_file`
 142    - `Dataset.to_ntv`
 143    - `Dataset.to_obj`
 144    - `Dataset.to_xarray`
 145    - `Dataset.view`
 146    - `Dataset.vlist`
 147    - `Dataset.voxel`
 148    '''
 149
 150    field_class = Sfield
 151
 152    def __init__(self, listidx=None, name=None, reindex=True):
 153        '''
 154        Dataset constructor.
 155
 156        *Parameters*
 157
 158        - **listidx** :  list (default None) - list of Field data
 159        - **name** :  string (default None) - name of the dataset
 160        - **reindex** : boolean (default True) - if True, default codec for each Field'''
 161
 162        self.field = self.field_class
 163        Cdataset.__init__(self, listidx, name, reindex=reindex)
 164
 165    @classmethod
 166    def from_csv(cls, filename='dataset.csv', header=True, nrow=None, decode_str=True,
 167                 decode_json=True, optcsv={'quoting': csv.QUOTE_NONNUMERIC}):
 168        '''
 169        Dataset constructor (from a csv file). Each column represents index values.
 170
 171        *Parameters*
 172
 173        - **filename** : string (default 'dataset.csv'), name of the file to read
 174        - **header** : boolean (default True). If True, the first row is dedicated to names
 175        - **nrow** : integer (default None). Number of rows to read. If None, all rows are read
 176        - **optcsv** : dict (default : quoting) - see csv.reader options'''
 177        if not optcsv:
 178            optcsv = {}
 179        if not nrow:
 180            nrow = -1
 181        with open(filename, newline='', encoding="utf-8") as file:
 182            reader = csv.reader(file, **optcsv)
 183            irow = 0
 184            for row in reader:
 185                if irow == nrow:
 186                    break
 187                if irow == 0:
 188                    idxval = [[] for i in range(len(row))]
 189                    idxname = [''] * len(row)
 190                if irow == 0 and header:
 191                    idxname = row
 192                else:
 193                    for i in range(len(row)):
 194                        if decode_json:
 195                            try:
 196                                idxval[i].append(json.loads(row[i]))
 197                            except (TypeError, ValueError):  # not valid json: keep the raw value
 198                                idxval[i].append(row[i])
 199                        else:
 200                            idxval[i].append(row[i])
 201                irow += 1
 202        lindex = [cls.field_class.from_ntv(
 203            {name: idx}, decode_str=decode_str) for idx, name in zip(idxval, idxname)]
 204        return cls(listidx=lindex, reindex=True)
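    # Usage sketch for 'from_csv' (hypothetical file 'scores.csv' holding one
    # column per Field and a header row with the Field names):
    #
    #   dts = Sdataset.from_csv('scores.csv', header=True)
    #   dts.lname    # column names taken from the header row
    #   len(dts)     # number of data rows read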
 205
 206    @classmethod
 207    def from_file(cls, filename, forcestring=False, reindex=True, decode_str=False):
 208        '''
 209        Generate Object from file storage.
 210
 211         *Parameters*
 212
 213        - **filename** : string - file name (with path)
 214        - **forcestring** : boolean (default False) - if True,
 215        forces the UTF-8 data format, else the format is calculated
 216        - **reindex** : boolean (default True) - if True, default codec for each Field
 218        - **decode_str**: boolean (default False) - if True, strings are loaded as json data
 218
 219        *Returns* : new Object'''
 220        with open(filename, 'rb') as file:
 221            btype = file.read(1)
 222        if btype == bytes('[', 'UTF-8') or btype == bytes('{', 'UTF-8') or forcestring:
 223            with open(filename, 'r', newline='', encoding="utf-8") as file:
 224                bjson = file.read()
 225        else:
 226            with open(filename, 'rb') as file:
 227                bjson = file.read()
 228        return cls.from_ntv(bjson, reindex=reindex, decode_str=decode_str)
 229
 230    def merge(self, fillvalue=math.nan, reindex=False, simplename=False):
 231        '''
 232        Merge method replaces each included Dataset object with its constituent Fields.
 233
 234        *Parameters*
 235
 236        - **fillvalue** : object (default nan) - value used for the additional data
 237        - **reindex** : boolean (default False) - if True, set default codec after transformation
 238        - **simplename** : boolean (default False) - if True, new Field names are
 239        the same as the merged Field names, else composed names are used.
 240
 241        *Returns*: merged Dataset '''
 242        ilc = copy(self)
 243        delname = []
 244        row = ilc[0]
 245        if not isinstance(row, list):
 246            row = [row]
 247        merged, oldname, newname = self.__class__._mergerecord(
 248            self.ext(row, ilc.lname), simplename=simplename, fillvalue=fillvalue,
 249            reindex=reindex)
 250        delname.append(oldname)
 251        for ind in range(1, len(ilc)):
 252            oldidx = ilc.nindex(oldname)
 253            for name in newname:
 254                ilc.addindex(self.field(oldidx.codec, name, oldidx.keys))
 255            row = ilc[ind]
 256            if not isinstance(row, list):
 257                row = [row]
 258            rec, oldname, newname = self.__class__._mergerecord(
 259                self.ext(row, ilc.lname), simplename=simplename)
 260            if oldname and newname != [oldname]:
 261                delname.append(oldname)
 262            for name in newname:
 263                oldidx = merged.nindex(oldname)
 264                fillval = self.field.s_to_i(fillvalue)
 265                merged.addindex(
 266                    self.field([fillval] * len(merged), name, oldidx.keys))
 267            merged += rec
 268        for name in set(delname):
 269            if name:
 270                merged.delindex(name)
 271        if reindex:
 272            merged.reindex()
 273        ilc.lindex = merged.lindex
 274        return ilc
 275
 276    @classmethod
 277    def ext(cls, idxval=None, idxname=None, reindex=True, fast=False):
 278        '''
 279        Dataset constructor (external index).
 280
 281        *Parameters*
 282
 283        - **idxval** : list of Field or list of values (see data model)
 284        - **idxname** : list of string (default None) - list of Field name (see data model)'''
 285        if idxval is None:
 286            idxval = []
 287        if not isinstance(idxval, list):
 288            return None
 289        val = []
 290        for idx in idxval:
 291            if not isinstance(idx, list):
 292                val.append([idx])
 293            else:
 294                val.append(idx)
 295        lenval = [len(idx) for idx in val]
 296        if lenval and max(lenval) != min(lenval):
 297            raise DatasetError('the length of Iindex are different')
 298        length = lenval[0] if lenval else 0
 299        idxname = [None] * len(val) if idxname is None else idxname
 300        for ind, name in enumerate(idxname):
 301            if name is None or name == '$default':
 302                idxname[ind] = 'i'+str(ind)
 303        lindex = [cls.field_class(codec, name, lendefault=length, reindex=reindex,
 304                                  fast=fast) for codec, name in zip(val, idxname)]
 305        return cls(lindex, reindex=False)
 306
 307# %% internal
 308    @staticmethod
 309    def _mergerecord(rec, mergeidx=True, updateidx=True, simplename=False, 
 310                     fillvalue=math.nan, reindex=False):
 311        row = rec[0]
 312        if not isinstance(row, list):
 313            row = [row]
 314        var = -1
 315        for ind, val in enumerate(row):
 316            if val.__class__.__name__ in ['Sdataset', 'Ndataset']:
 317                var = ind
 318                break
 319        if var < 0:
 320            return (rec, None, [])
 321        #ilis = row[var]
 322        ilis = row[var].merge(simplename=simplename, fillvalue=fillvalue, reindex=reindex)
 323        oldname = rec.lname[var]
 324        if ilis.lname == ['i0']:
 325            newname = [oldname]
 326            ilis.setname(newname)
 327        elif not simplename:
 328            newname = [oldname + '_' + name for name in ilis.lname]
 329            ilis.setname(newname)
 330        else:
 331            newname = copy(ilis.lname)
 332        for name in rec.lname:
 333            if name in newname:
 334                newname.remove(name)
 335            else:
 336                updidx = name in ilis.lname and not updateidx
 337                #ilis.addindex({name: [rec.nindex(name)[0]] * len(ilis)},
 338                ilis.addindex(ilis.field([rec.nindex(name)[0]] * len(ilis), name),
 339                              merge=mergeidx, update=updidx)
 340        return (ilis, oldname, newname)
 341
 342# %% special
 343    def __str__(self):
 344        '''return string format for var and lidx'''
 345        stri = ''
 346        if self.lvar:
 347            stri += 'variables :\n'
 348            for idx in self.lvar:
 349                stri += '    ' + str(idx) + '\n'
 350        if self.lidx:
 351            stri += 'index :\n'
 352            for idx in self.lidx:
 353                stri += '    ' + str(idx) + '\n'
 354        return stri
 355
 356    def __add__(self, other):
 357        ''' Add other's values to self's values in a new Dataset'''
 358        newil = copy(self)
 359        newil.__iadd__(other)
 360        return newil
 361
 362    def __iadd__(self, other):
 363        ''' Add other's values to self's values'''
 364        return self.add(other, name=True, solve=False)
 365
 366    def __or__(self, other):
 367        ''' Add other's index to self's index in a new Dataset'''
 368        newil = copy(self)
 369        newil.__ior__(other)
 370        return newil
 371
 372    def __ior__(self, other):
 373        ''' Add other's index to self's index'''
 374        return self.orindex(other, first=False, merge=True, update=False)
 375
 376# %% property
 377    @property
 378    def consistent(self):
 379        ''' True if all the record are different'''
 380        selfiidx = self.iidx
 381        if not selfiidx:
 382            return True
 383        return max(Counter(zip(*selfiidx)).values()) == 1
 384
 385    @property
 386    def extidx(self):
 387        '''idx values (see data model)'''
 388        return [idx.values for idx in self.lidx]
 389
 390    @property
 391    def extidxext(self):
 392        '''idx val (see data model)'''
 393        return [idx.val for idx in self.lidx]
 394
 395    @property
 396    def idxname(self):
 397        ''' list of idx name'''
 398        return [idx.name for idx in self.lidx]
 399
 400    @property
 401    def idxlen(self):
 402        ''' list of idx codec length'''
 403        return [len(idx.codec) for idx in self.lidx]
 404
 405    @property
 406    def iidx(self):
 407        ''' list of keys for each idx'''
 408        return [idx.keys for idx in self.lidx]
 409
 410    @property
 411    def lenidx(self):
 412        ''' number of idx'''
 413        return len(self.lidx)
 414
 415    @property
 416    def lidx(self):
 417        '''list of idx'''
 418        return [self.lindex[i] for i in self.lidxrow]
 419
 420    @property
 421    def lisvar(self):
 422        '''list of boolean : True if Field is var'''
 423        return [name in self.lvarname for name in self.lname]
 424
 425    @property
 426    def lvar(self):
 427        '''list of var'''
 428        return [self.lindex[i] for i in self.lvarrow]
 429
 430    @property
 431    def lvarrow(self):
 432        '''list of var row'''
 433        return [self.lname.index(name) for name in self.lvarname]
 434
 435    @property
 436    def lidxrow(self):
 437        '''list of idx row'''
 438        return [i for i in range(self.lenindex) if i not in self.lvarrow]
 439
 440    @property
 441    def primary(self):
 442        ''' list of primary idx'''
 443        return [self.lidxrow.index(self.lname.index(name)) for name in self.primaryname]
 444
 445    @property
 446    def secondary(self):
 447        ''' list of secondary idx'''
 448        return [self.lidxrow.index(self.lname.index(name)) for name in self.secondaryname]
 449
 450    @property
 451    def setidx(self):
 452        '''list of codec for each idx'''
 453        return [idx.codec for idx in self.lidx]
 454
 455    @property
 456    def zip(self):
 457        '''return a zip format for transpose(extidx) : tuple(tuple(rec))'''
 458        textidx = Cutil.transpose(self.extidx)
 459        if not textidx:
 460            return None
 461        return tuple(tuple(idx) for idx in textidx)
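    # Property sketch on a small Dataset (illustrative values; which Fields are
    # classified as idx or var is computed by the analysis module):
    #
    #   dts = Sdataset.ext([['a', 'b', 'a'], [10, 20, 10]], ['letter', 'number'])
    #   dts.idxname   # names of the idx Fields
    #   dts.iidx      # keys of each idx Field (one list of int per idx)
    #   dts.setidx    # codec (distinct values) of each idx Field
    #   dts.zip       # tuple of records built from the external idx values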
 462
 463    # %% structure
 464    def addindex(self, index, first=False, merge=False, update=False):
 465        '''add a new index.
 466
 467        *Parameters*
 468
 469        - **index** : Field - index to add (can be index Ntv representation)
 470        - **first** : If True insert index at the first row, else at the end
 471        - **merge** : create a new index if merge is False
 472        - **update** : if True, update actual values if index name is present (and merge is True)
 473
 474        *Returns* : none '''
 475        idx = self.field.ntv(index)
 476        idxname = self.lname
 477        if len(idx) != len(self) and len(self) > 0:
 478            raise DatasetError('sizes are different')
 479        if not idx.name in idxname:
 480            if first:
 481                self.lindex.insert(0, idx)
 482            else:
 483                self.lindex.append(idx)
 484        elif not merge:  # si idx.name in idxname
 485            while idx.name in idxname:
 486                idx.name += '(2)'
 487            if first:
 488                self.lindex.insert(0, idx)
 489            else:
 490                self.lindex.append(idx)
 491        elif update:  # si merge et si idx.name in idxname
 492            self.lindex[idxname.index(idx.name)].setlistvalue(idx.values)
 493
 494    def append(self, record, unique=False):
 495        '''add a new record.
 496
 497        *Parameters*
 498
 499        - **record** :  list of new index values to add to Dataset
 500        - **unique** :  boolean (default False) - the append is not performed if unique
 501        is True and the record is already present
 502
 503        *Returns* : list - key record'''
 504        if self.lenindex != len(record):
 505            raise DatasetError('len(record) not consistent')
 506        record = self.field.l_to_i(record)
 507        if self.isinrecord(self.idxrecord(record), False) and unique:
 508            return None
 509        return [self.lindex[i].append(record[i]) for i in range(self.lenindex)]
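    # Usage sketch for 'append' (illustrative values):
    #
    #   dts = Sdataset.ext([['a', 'b'], [10, 20]], ['letter', 'number'])
    #   dts.append(['c', 30])                # adds the record, returns its keys
    #   dts.append(['c', 30], unique=True)   # returns None: record already present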
 510
 511    def applyfilter(self, reverse=False, filtname=FILTER, delfilter=True, inplace=True):
 512        '''delete records with defined filter value.
 513        Filter is deleted after record filtering.
 514
 515        *Parameters*
 516
 517        - **reverse** :  boolean (default False) - if False, delete the records where
 518        the filter value is False, else delete the records where it is True
 519        - **filtname** : string (default FILTER) - Name of the filter Field added
 520        - **delfilter** :  boolean (default True) - If True, delete filter's Field
 521        - **inplace** : boolean (default True) - if True, the filter is applied to self, else to a new Dataset
 522
 523        *Returns* : self or new Dataset'''
 524        if not filtname in self.lname:
 525            return None
 526        if inplace:
 527            ilis = self
 528        else:
 529            ilis = copy(self)
 530        ifilt = ilis.lname.index(filtname)
 531        ilis.sort([ifilt], reverse=not reverse, func=None)
 532        lisind = ilis.lindex[ifilt].recordfromvalue(reverse)
 533        if lisind:
 534            minind = min(lisind)
 535            for idx in ilis.lindex:
 536                del idx.keys[minind:]
 537        if inplace:
 538            self.delindex(filtname)
 539        else:
 540            ilis.delindex(filtname)
 541            if delfilter:
 542                self.delindex(filtname)
 543        ilis.reindex()
 544        return ilis
 545
 546    def coupling(self, derived=True, level=0.1):
 547        '''Transform idx with low distance into coupled or derived indexes (codec extension).
 548
 549        *Parameters*
 550
 551        - **level** : float (default 0.1) - param threshold to apply coupling.
 552        - **derived** : boolean (default : True). If True, indexes are derived,
 553        else coupled.
 554
 555        *Returns* : None'''
 556        ana = self.analysis
 557        child = [[]] * len(ana)
 558        childroot = []
 559        level = level * len(self)
 560        for idx in range(self.lenindex):
 561            if derived:
 562                iparent = ana.fields[idx].p_distomin.index
 563            else:
 564                iparent = ana.fields[idx].p_distance.index
 565            if iparent == -1:
 566                childroot.append(idx)
 567            else:
 568                child[iparent].append(idx)
 569        for idx in childroot:
 570            self._couplingidx(idx, child, derived, level, ana)
 571
 572    def _couplingidx(self, idx, child, derived, level, ana):
 573        ''' Field coupling (including the children of the Field)'''
 574        fields = ana.fields
 575        if derived:
 576            iparent = fields[idx].p_distomin.index
 577            dparent = ana.get_relation(*sorted([idx, iparent])).distomin
 578        else:
 579            iparent = fields[idx].p_distance.index
 580            dparent = ana.get_relation(*sorted([idx, iparent])).distance
 581        # if fields[idx].category in ('coupled', 'unique') or iparent == -1\
 582        if fields[idx].category in ('coupled', 'unique') \
 583                or dparent >= level or dparent == 0:
 584            return
 585        if child[idx]:
 586            for childidx in child[idx]:
 587                self._couplingidx(childidx, child, derived, level, ana)
 588        self.lindex[iparent].coupling(self.lindex[idx], derived=derived,
 589                                      duplicate=False)
 590        return
 591
 592    def delrecord(self, record, extern=True):
 593        '''remove a record.
 594
 595        *Parameters*
 596
 597        - **record** :  list - index values of the record to remove from the Dataset
 598        - **extern** : if True, compare record values to external representation
 599        of self.value, else, internal
 600
 601        *Returns* : row deleted'''
 602        self.reindex()
 603        reckeys = self.valtokey(record, extern=extern)
 604        if None in reckeys:
 605            return None
 606        row = self.tiindex.index(reckeys)
 607        for idx in self:
 608            del idx[row]
 609        return row
 610
 611    def _fullindex(self, ind, keysadd, indexname, varname, leng, fillvalue, fillextern):
 612        if not varname:
 613            varname = []
 614        idx = self.lindex[ind]
 615        lenadd = len(keysadd[0])
 616        if len(idx) == leng:
 617            return
 618        #inf = self.indexinfos()
 619        ana = self.anafields
 620        parent = ana[ind].p_derived.view('index')
 621        # if inf[ind]['cat'] == 'unique':
 622        if ana[ind].category == 'unique':
 623            idx.set_keys(idx.keys + [0] * lenadd)
 624        elif self.lname[ind] in indexname:
 625            idx.set_keys(idx.keys + keysadd[indexname.index(self.lname[ind])])
 626        # elif inf[ind]['parent'] == -1 or self.lname[ind] in varname:
 627        elif parent == -1 or self.lname[ind] in varname:
 628            fillval = fillvalue
 629            if fillextern:
 630                fillval = self.field.s_to_i(fillvalue)
 631            idx.set_keys(idx.keys + [len(idx.codec)] * len(keysadd[0]))
 632            idx.set_codec(idx.codec + [fillval])
 633        else:
 634            #parent = inf[ind]['parent']
 635            if len(self.lindex[parent]) != leng:
 636                self._fullindex(parent, keysadd, indexname, varname, leng,
 637                                fillvalue, fillextern)
 638            # if inf[ind]['cat'] == 'coupled':
 639            if ana[ind].category == 'coupled':
 640                idx.tocoupled(self.lindex[parent], coupling=True)
 641            else:
 642                idx.tocoupled(self.lindex[parent], coupling=False)
 643
 644    def full(self, reindex=False, idxname=None, varname=None, fillvalue='-',
 645             fillextern=True, inplace=True, canonical=True):
 646        '''transform a list of indexes into crossed indexes (value extension).
 647
 648        *Parameters*
 649
 650        - **idxname** : list of string - name of indexes to transform
 651        - **varname** : list of string (default None) - names of indexes to be filled with fillvalue
 652        - **reindex** : boolean (default False) - if True, set default codec
 653        before transformation
 654        - **fillvalue** : object value used for var extension
 655        - **fillextern** : boolean(default True) - if True, fillvalue is converted
 656        to internal value
 657        - **inplace** : boolean (default True) - if True, the transformation is applied to self, else to a new Dataset
 658        - **canonical** : boolean (default True) - if True, Field are ordered
 659        in canonical order
 660
 661        *Returns* : self or new Dataset'''
 662        ilis = self if inplace else copy(self)
 663        if not idxname:
 664            idxname = ilis.primaryname
 665        if reindex:
 666            ilis.reindex()
 667        keysadd = Cutil.idxfull([ilis.nindex(name) for name in idxname])
 668        if keysadd and len(keysadd) != 0:
 669            newlen = len(keysadd[0]) + len(ilis)
 670            for ind in range(ilis.lenindex):
 671                ilis._fullindex(ind, keysadd, idxname, varname, newlen,
 672                                fillvalue, fillextern)
 673        if canonical:
 674            ilis.setcanonorder()
 675        return ilis
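    # Usage sketch for 'full' (illustrative data; 'date' and 'city' are used here
    # as the crossed indexes):
    #
    #   dts = Sdataset.ext([['d1', 'd1', 'd2'], ['paris', 'lyon', 'paris'],
    #                       [10, 20, 30]], ['date', 'city', 'temp'])
    #   dts.full(idxname=['date', 'city'], fillvalue='-')
    #   len(dts)    # 4 records: the missing ('d2', 'lyon') combination is added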
 676
 677    def getduplicates(self, indexname=None, resindex=None, indexview=None):
 678        '''check duplicate codec values in a list of indexes. The result is added to
 679        a new index or returned.
 680
 681        *Parameters*
 682
 683        - **indexname** : list of string (default none) - name of indexes to check
 684        (if None, all Field)
 685        - **resindex** : string (default None) - Add a new index named resindex
 686        with check result (False if duplicate)
 687        - **indexview** : list of str (default None) - list of fields to return
 688
 689        *Returns* : list of int - list of rows with duplicate codec values '''
 690        if not indexname:
 691            indexname = self.lname
 692        duplicates = []
 693        for name in indexname:
 694            duplicates += self.nindex(name).getduplicates()
 695        if resindex and isinstance(resindex, str):
 696            newidx = self.field([True] * len(self), name=resindex)
 697            for item in duplicates:
 698                newidx[item] = False
 699            self.addindex(newidx)
 700        dupl = tuple(set(duplicates))
 701        if not indexview:
 702            return dupl
 703        return [tuple(self.record(ind, indexview)) for ind in dupl]
 704
 705    def iscanonorder(self):
 706        '''return True if primary indexes have canonical ordered keys'''
 707        primary = self.primary
 708        canonorder = Cutil.canonorder(
 709            [len(self.lidx[idx].codec) for idx in primary])
 710        return canonorder == [self.lidx[idx].keys for idx in primary]
 711
 712    def isinrecord(self, record, extern=True):
 713        '''Check if record is present in self.
 714
 715        *Parameters*
 716
 717        - **record** : list - value for each Field
 718        - **extern** : if True, compare record values to external representation
 719        of self.value, else, internal
 720
 721        *Returns boolean* : True if found'''
 722        if extern:
 723            return record in Cutil.transpose(self.extidxext)
 724        return record in Cutil.transpose(self.extidx)
 725
 726    def idxrecord(self, record):
 727        '''return rec array (without variable) from complete record (with variable)'''
 728        return [record[self.lidxrow[i]] for i in range(len(self.lidxrow))]
 729
 730    def keytoval(self, listkey, extern=True):
 731        '''
 732        convert a keys list (key for each index) to a values list (value for each index).
 733
 734        *Parameters*
 735
 736        - **listkey** : key for each index
 737        - **extern** : boolean (default True) - if True, compare rec to val else to values
 738
 739        *Returns*
 740
 741        - **list** : value for each index'''
 742        return [idx.keytoval(key, extern=extern) for idx, key in zip(self.lindex, listkey)]
 743
 744    def loc(self, rec, extern=True, row=False):
 745        '''
 746        Return record or row corresponding to a list of idx values.
 747
 748        *Parameters*
 749
 750        - **rec** : list - value for each idx
 751        - **extern** : boolean (default True) - if True, compare rec to val,
 752        else to values
 753        - **row** : Boolean (default False) - if True, return list of row,
 754        else list of records
 755
 756        *Returns*
 757
 758        - **list or None** : matching records (or rows if row is True), or None if not found'''
 759        locrow = None
 760        try:
 761            if len(rec) == self.lenindex:
 762                locrow = list(set.intersection(*[set(self.lindex[i].loc(rec[i], extern))
 763                                               for i in range(self.lenindex)]))
 764            elif len(rec) == self.lenidx:
 765                locrow = list(set.intersection(*[set(self.lidx[i].loc(rec[i], extern))
 766                                               for i in range(self.lenidx)]))
 767        except Exception:  # rec values not found or not comparable
 768            pass
 769        if locrow is None:
 770            return None
 771        if row:
 772            return locrow
 773        return [self.record(locr, extern=extern) for locr in locrow]
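    # Usage sketch for 'loc' (illustrative values):
    #
    #   dts = Sdataset.ext([['a', 'b'], [10, 20]], ['letter', 'number'])
    #   dts.loc(['a', 10])             # list of matching records
    #   dts.loc(['a', 10], row=True)   # list of matching row numbers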
 774
 775    def mix(self, other, fillvalue=None):
 776        '''add the Fields of other not included in self, then add other's values'''
 777        sname = set(self.lname)
 778        oname = set(other.lname)
 779        newself = copy(self)
 780        copother = copy(other)
 781        for nam in oname - sname:
 782            newself.addindex({nam: [fillvalue] * len(newself)})
 783        for nam in sname - oname:
 784            copother.addindex({nam: [fillvalue] * len(copother)})
 785        return newself.add(copother, name=True, solve=False)
 786
 787    def merging(self, listname=None):
 788        ''' add a new Field built from the Fields defined in listname.
 789        Values of the new Field are sets of the values of the listname Fields'''
 790        #self.addindex(Field.merging([self.nindex(name) for name in listname]))
 791        self.addindex(Sfield.merging([self.nindex(name) for name in listname]))
 792
 793    def orindex(self, other, first=False, merge=False, update=False):
 794        ''' Add other's index to self's index (with same length)
 795
 796        *Parameters*
 797
 798        - **other** : self class - object to add
 799        - **first** : Boolean (default False) - If True insert indexes
 800        at the first row, else at the end
 801        - **merge** : Boolean (default False) - create a new index
 802        if merge is False
 803        - **update** : Boolean (default False) - if True, update actual
 804        values if index name is present (and merge is True)
 805
 806        *Returns* : none '''
 807        if len(self) != 0 and len(self) != len(other) and len(other) != 0:
 808            raise DatasetError("the sizes are not equal")
 809        otherc = copy(other)
 810        for idx in otherc.lindex:
 811            self.addindex(idx, first=first, merge=merge, update=update)
 812        return self
 813
 814    def record(self, row, indexname=None, extern=True):
 815        '''return the record at the row
 816
 817        *Parameters*
 818
 819        - **row** : int - row of the record
 820        - **extern** : boolean (default True) - if True, return val record else
 821        value record
 822        - **indexname** : list of str (default None) - list of fields to return
 823        *Returns*
 824
 825        - **list** : val record or value record'''
 826        if indexname is None:
 827            indexname = self.lname
 828        if extern:
 829            record = [idx.val[row] for idx in self.lindex]
 830            #record = [idx.values[row].to_obj() for idx in self.lindex]
 831            #record = [idx.valrow(row) for idx in self.lindex]
 832        else:
 833            record = [idx.values[row] for idx in self.lindex]
 834        return [record[self.lname.index(name)] for name in indexname]
 835
 836    def recidx(self, row, extern=True):
 837        '''return the list of idx val or values at the row
 838
 839        *Parameters*
 840
 841        - **row** : int - row of the record
 842        - **extern** : boolean (default True) - if True, return val rec else value rec
 843
 844        *Returns*
 845
 846        - **list** : val or value for idx'''
 847        if extern:
 848            return [idx.values[row].to_obj() for idx in self.lidx]
 849            # return [idx.valrow(row) for idx in self.lidx]
 850        return [idx.values[row] for idx in self.lidx]
 851
 852    def recvar(self, row, extern=True):
 853        '''return the list of var val or values at the row
 854
 855        *Parameters*
 856
 857        - **row** : int - row of the record
 858        - **extern** : boolean (default True) - if True, return val rec else value rec
 859
 860        *Returns*
 861
 862        - **list** : val or value for var'''
 863        if extern:
 864            return [idx.values[row].to_obj() for idx in self.lvar]
 865            # return [idx.valrow(row) for idx in self.lvar]
 866        return [idx.values[row] for idx in self.lvar]
 867
 868    def setcanonorder(self, reindex=False):
 869        '''Set the canonical index order : primary - secondary/unique - variable.
 870        Set the canonical keys order : ordered keys in the first columns.
 871
 872        *Parameters*
 873        - **reindex** : boolean (default False) - if True, set default codec after
 874        transformation
 875
 876        *Return* : self'''
 877        order = self.primaryname
 878        order += self.secondaryname
 879        order += self.lvarname
 880        order += self.lunicname
 881        self.swapindex(order)
 882        self.sort(reindex=reindex)
 883        # self.analysis.actualize()
 884        return self
 885
 886    def setfilter(self, filt=None, first=False, filtname=FILTER, unique=False):
 887        '''Add a filter index with boolean values
 888
 889        - **filt** : list of boolean - values of the filter idx to add
 890        - **first** : boolean (default False) - If True insert index at the first row,
 891        else at the end
 892        - **filtname** : string (default FILTER) - Name of the filter Field added
 893
 894        *Returns* : self'''
 895        if not filt:
 896            filt = [True] * len(self)
 897        idx = self.field(filt, name=filtname)
 898        idx.reindex()
 899        if not idx.cod in ([True, False], [False, True], [True], [False]):
 900            raise DatasetError('filt is not consistent')
 901        if unique:
 902            for name in self.lname:
 903                if name[:len(FILTER)] == FILTER:
 904                    self.delindex(FILTER)
 905        self.addindex(idx, first=first)
 906        return self
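    # Usage sketch combining 'setfilter' and 'applyfilter' (illustrative values):
    #
    #   dts = Sdataset.ext([['a', 'b', 'c'], [10, 20, 30]], ['letter', 'number'])
    #   dts.setfilter([True, False, True])   # adds a '$filter' boolean Field
    #   dts.applyfilter()                    # drops the record flagged False
    #   len(dts)                             # -> 2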
 907
 908    def sort(self, order=None, reverse=False, func=str, reindex=True):
 909        '''Sort data following the index order and apply the ascending or descending
 910        sort function to values.
 911
 912        *Parameters*
 913
 914        - **order** : list (default None)- new order of index to apply. If None or [],
 915        the sort function is applied to the existing order of indexes.
 916        - **reverse** : boolean (default False) - descending if True, ascending if False
 917        - **func**    : function (default str) - parameter key used in the sorted function
 918        - **reindex** : boolean (default True) - if True, apply a new codec order (key = func)
 919
 920        *Returns* : self'''
 921        if not order:
 922            order = list(range(self.lenindex))
 923        orderfull = order + list(set(range(self.lenindex)) - set(order))
 924        if reindex:
 925            for i in order:
 926                self.lindex[i].reindex(codec=sorted(
 927                    self.lindex[i].codec, key=func))
 928        newidx = Cutil.transpose(sorted(Cutil.transpose(
 929            [self.lindex[orderfull[i]].keys for i in range(self.lenindex)]),
 930            reverse=reverse))
 931        for i in range(self.lenindex):
 932            self.lindex[orderfull[i]].set_keys(newidx[i])
 933        return self
 934
 935    """
 936    def swapindex(self, order):
 937        '''
 938        Change the order of the index .
 939
 940        *Parameters*
 941
 942        - **order** : list of int or list of name - new order of index to apply.
 943
 944        *Returns* : self '''
 945        if self.lenindex != len(order):
 946            raise DatasetError('length of order and Dataset different')
 947        if not order or isinstance(order[0], int):
 948            self.lindex = [self.lindex[ind] for ind in order]
 949        elif isinstance(order[0], str):
 950            self.lindex = [self.nindex(name) for name in order]
 951        return self
 952    """
 953
 954    def tostdcodec(self, inplace=False, full=True):
 955        '''Transform all codecs into full or default codecs.
 956
 957        *Parameters*
 958
 959        - **inplace** : boolean  (default False) - if True apply transformation
 960        to self, else to a new Dataset
 961        - **full** : boolean (default True)- full codec if True, default if False
 962
 963
 964        *Return Dataset* : self or new Dataset'''
 965        lindex = [idx.tostdcodec(inplace=False, full=full)
 966                  for idx in self.lindex]
 967        if inplace:
 968            self.lindex = lindex
 969            return self
 970        return self.__class__(lindex, self.lvarname)
 971
 972    def updateindex(self, listvalue, index, extern=True):
 973        '''update values of an index.
 974
 975        *Parameters*
 976
 977        - **listvalue** : list - index values to replace
 978        - **index** : integer - row of the index to update (position in lindex)
 979        - **extern** : if True, the listvalue has external representation, else internal
 980
 981        *Returns* : none '''
 982        self.lindex[index].setlistvalue(listvalue, extern=extern)
 983
 984    def valtokey(self, rec, extern=True):
 985        '''convert a record list (value or val for each idx) to a key list
 986        (key for each index).
 987
 988        *Parameters*
 989
 990        - **rec** : list of value or val for each index
 991        - **extern** : if True, the rec value has external representation, else internal
 992
 993        *Returns*
 994
 995        - **list of int** : record key for each index'''
 996        return [idx.valtokey(val, extern=extern) for idx, val in zip(self.lindex, rec)]
 997
 998class Ndataset(Sdataset):
 999    # %% Ndataset
1000    '''    
1001    `Ndataset` is a child class of `Sdataset` where internal values are NTV entities.
1002    
1003    All the methods are the same as `Sdataset`.
1004    '''
1005    field_class = Nfield
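
The sketch below illustrates that both classes expose the same interface and differ only in their internal representation (simple values for `Sdataset`, NTV entities for `Ndataset`); values are illustrative.

    from tab_dataset.dataset import Ndataset, Sdataset

    sdts = Sdataset.ext([['a', 'b'], [10, 20]], ['letter', 'number'])
    ndts = Ndataset.ext([['a', 'b'], [10, 20]], ['letter', 'number'])

    print(sdts.lname == ndts.lname)   # True: same Field names
    print(len(sdts) == len(ndts))     # True: same number of records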
class Sdataset(tab_dataset.dataset_interface.DatasetInterface, tab_dataset.cdataset.Cdataset):
 26class Sdataset(DatasetInterface, Cdataset):
 27    # %% intro
 28    '''
 29    `Sdataset` is a child class of Cdataset where internal value can be different
 30    from external value (list is converted in tuple and dict in json-object).
 31    
 32    One attribute is added: 'field' to define the 'field' class.
 33
 34    The methods defined in this class are :
 35
 36    *constructor (@classmethod)*
 37
 38    - `Sdataset.from_csv`
 39    - `Sdataset.from_file`
 40    - `Sdataset.merge`
 41    - `Sdataset.ext`
 42    - `Cdataset.ntv`
 43    - `Cdataset.from_ntv`
 44
 45    *dynamic value - module analysis (getters @property)*
 46
 47    - `DatasetAnalysis.analysis`
 48    - `DatasetAnalysis.anafields`
 49    - `Sdataset.extidx`
 50    - `Sdataset.extidxext`
 51    - `DatasetAnalysis.field_partition`
 52    - `Sdataset.idxname`
 53    - `Sdataset.idxlen`
 54    - `Sdataset.iidx`
 55    - `Sdataset.lenidx`
 56    - `Sdataset.lidx`
 57    - `Sdataset.lidxrow`
 58    - `Sdataset.lisvar`
 59    - `Sdataset.lvar`
 60    - `DatasetAnalysis.lvarname`
 61    - `Sdataset.lvarrow`
 62    - `Cdataset.lunicname`
 63    - `Cdataset.lunicrow`
 64    - `DatasetAnalysis.partitions`
 65    - `DatasetAnalysis.primaryname`
 66    - `DatasetAnalysis.relation`
 67    - `DatasetAnalysis.secondaryname`
 68    - `Sdataset.setidx`
 69    - `Sdataset.zip`
 70
 71    *dynamic value (getters @property)*
 72
 73    - `Cdataset.keys`
 74    - `Cdataset.iindex`
 75    - `Cdataset.indexlen`
 76    - `Cdataset.lenindex`
 77    - `Cdataset.lname`
 78    - `Cdataset.tiindex`
 79
 80    *global value (getters @property)*
 81
 82    - `DatasetAnalysis.complete`
 83    - `Sdataset.consistent`
 84    - `DatasetAnalysis.dimension`
 85    - `Sdataset.primary`
 86    - `Sdataset.secondary`
 87
 88    *selecting - infos methods*
 89
 90    - `Sdataset.idxrecord`
 91    - `DatasetAnalysis.indexinfos`
 92    - `DatasetAnalysis.indicator`
 93    - `Sdataset.iscanonorder`
 94    - `Sdataset.isinrecord`
 95    - `Sdataset.keytoval`
 96    - `Sdataset.loc`
 97    - `Cdataset.nindex`
 98    - `Sdataset.record`
 99    - `Sdataset.recidx`
100    - `Sdataset.recvar`
101    - `Cdataset.to_analysis`
102    - `DatasetAnalysis.tree`
103    - `Sdataset.valtokey`
104
105    *add - update methods*
106
107    - `Cdataset.add`
108    - `Sdataset.addindex`
109    - `Sdataset.append`
110    - `Cdataset.delindex`
111    - `Sdataset.delrecord`
112    - `Sdataset.orindex`
113    - `Cdataset.renameindex`
114    - `Cdataset.setname`
115    - `Sdataset.updateindex`
116
117    *structure management - methods*
118
119    - `Sdataset.applyfilter`
120    - `Cdataset.check_relation`
121    - `Cdataset.check_relationship`
122    - `Sdataset.coupling`
123    - `Sdataset.full`
124    - `Sdataset.getduplicates`
125    - `Sdataset.mix`
126    - `Sdataset.merging`
127    - `Cdataset.reindex`
128    - `Cdataset.reorder`
129    - `Sdataset.setfilter`
130    - `Sdataset.sort`
131    - `Cdataset.swapindex`
132    - `Sdataset.setcanonorder`
133    - `Sdataset.tostdcodec`
134
135    *exports methods (`observation.dataset_interface.DatasetInterface`)*
136
137    - `Dataset.json`
138    - `Dataset.plot`
139    - `Dataset.to_obj`
140    - `Dataset.to_csv`
141    - `Dataset.to_dataframe`
142    - `Dataset.to_file`
143    - `Dataset.to_ntv`
144    - `Dataset.to_obj`
145    - `Dataset.to_xarray`
146    - `Dataset.view`
147    - `Dataset.vlist`
148    - `Dataset.voxel`
149    '''
150
151    field_class = Sfield
152
153    def __init__(self, listidx=None, name=None, reindex=True):
154        '''
155        Dataset constructor.
156
157        *Parameters*
158
159        - **listidx** :  list (default None) - list of Field data
160        - **name** :  string (default None) - name of the dataset
161        - **reindex** : boolean (default True) - if True, default codec for each Field'''
162
163        self.field = self.field_class
164        Cdataset.__init__(self, listidx, name, reindex=reindex)
165
166    @classmethod
167    def from_csv(cls, filename='dataset.csv', header=True, nrow=None, decode_str=True,
168                 decode_json=True, optcsv={'quoting': csv.QUOTE_NONNUMERIC}):
169        '''
170        Dataset constructor (from a csv file). Each column represents index values.
171
172        *Parameters*
173
174        - **filename** : string (default 'dataset.csv'), name of the file to read
175        - **header** : boolean (default True). If True, the first raw is dedicated to names
176        - **nrow** : integer (default None). Number of row. If None, all the row else nrow
177        - **optcsv** : dict (default : quoting) - see csv.reader options'''
178        if not optcsv:
179            optcsv = {}
180        if not nrow:
181            nrow = -1
182        with open(filename, newline='', encoding="utf-8") as file:
183            reader = csv.reader(file, **optcsv)
184            irow = 0
185            for row in reader:
186                if irow == nrow:
187                    break
188                if irow == 0:
189                    idxval = [[] for i in range(len(row))]
190                    idxname = [''] * len(row)
191                if irow == 0 and header:
192                    idxname = row
193                else:
194                    for i in range(len(row)):
195                        if decode_json:
196                            try:
197                                idxval[i].append(json.loads(row[i]))
198                            except:
199                                idxval[i].append(row[i])
200                        else:
201                            idxval[i].append(row[i])
202                irow += 1
203        lindex = [cls.field_class.from_ntv(
204            {name: idx}, decode_str=decode_str) for idx, name in zip(idxval, idxname)]
205        return cls(listidx=lindex, reindex=True)
206
207    @classmethod
208    def from_file(cls, filename, forcestring=False, reindex=True, decode_str=False):
209        '''
210        Generate Object from file storage.
211
212         *Parameters*
213
214        - **filename** : string - file name (with path)
215        - **forcestring** : boolean (default False) - if True,
216        forces the UTF-8 data format, else the format is calculated
217        - **reindex** : boolean (default True) - if True, default codec for each Field
218        - **decode_str**: boolean (default False) - if True, string are loaded in json data
219
220        *Returns* : new Object'''
221        with open(filename, 'rb') as file:
222            btype = file.read(1)
223        if btype == bytes('[', 'UTF-8') or btype == bytes('{', 'UTF-8') or forcestring:
224            with open(filename, 'r', newline='', encoding="utf-8") as file:
225                bjson = file.read()
226        else:
227            with open(filename, 'rb') as file:
228                bjson = file.read()
229        return cls.from_ntv(bjson, reindex=reindex, decode_str=decode_str)
230
231    def merge(self, fillvalue=math.nan, reindex=False, simplename=False):
232        '''
233        Merge method replaces Dataset objects included into its constituents.
234
235        *Parameters*
236
237        - **fillvalue** : object (default nan) - value used for the additional data
238        - **reindex** : boolean (default False) - if True, set default codec after transformation
239        - **simplename** : boolean (default False) - if True, new Field name are
240        the same as merged Field name else it is a composed name.
241
242        *Returns*: merged Dataset '''
243        ilc = copy(self)
244        delname = []
245        row = ilc[0]
246        if not isinstance(row, list):
247            row = [row]
248        merged, oldname, newname = self.__class__._mergerecord(
249            self.ext(row, ilc.lname), simplename=simplename, fillvalue=fillvalue,
250            reindex=reindex)
251        delname.append(oldname)
252        for ind in range(1, len(ilc)):
253            oldidx = ilc.nindex(oldname)
254            for name in newname:
255                ilc.addindex(self.field(oldidx.codec, name, oldidx.keys))
256            row = ilc[ind]
257            if not isinstance(row, list):
258                row = [row]
259            rec, oldname, newname = self.__class__._mergerecord(
260                self.ext(row, ilc.lname), simplename=simplename)
261            if oldname and newname != [oldname]:
262                delname.append(oldname)
263            for name in newname:
264                oldidx = merged.nindex(oldname)
265                fillval = self.field.s_to_i(fillvalue)
266                merged.addindex(
267                    self.field([fillval] * len(merged), name, oldidx.keys))
268            merged += rec
269        for name in set(delname):
270            if name:
271                merged.delindex(name)
272        if reindex:
273            merged.reindex()
274        ilc.lindex = merged.lindex
275        return ilc
276
277    @classmethod
278    def ext(cls, idxval=None, idxname=None, reindex=True, fast=False):
279        '''
280        Dataset constructor (external index).
281
282        *Parameters*
283
284        - **idxval** : list of Field or list of values (see data model)
285        - **idxname** : list of string (default None) - list of Field name (see data model)'''
286        if idxval is None:
287            idxval = []
288        if not isinstance(idxval, list):
289            return None
290        val = []
291        for idx in idxval:
292            if not isinstance(idx, list):
293                val.append([idx])
294            else:
295                val.append(idx)
296        lenval = [len(idx) for idx in val]
297        if lenval and max(lenval) != min(lenval):
298            raise DatasetError('the length of Iindex are different')
299        length = lenval[0] if lenval else 0
300        idxname = [None] * len(val) if idxname is None else idxname
301        for ind, name in enumerate(idxname):
302            if name is None or name == '$default':
303                idxname[ind] = 'i'+str(ind)
304        lindex = [cls.field_class(codec, name, lendefault=length, reindex=reindex,
305                                  fast=fast) for codec, name in zip(val, idxname)]
306        return cls(lindex, reindex=False)
307
308# %% internal
309    @staticmethod
310    def _mergerecord(rec, mergeidx=True, updateidx=True, simplename=False, 
311                     fillvalue=math.nan, reindex=False):
312        row = rec[0]
313        if not isinstance(row, list):
314            row = [row]
315        var = -1
316        for ind, val in enumerate(row):
317            if val.__class__.__name__ in ['Sdataset', 'Ndataset']:
318                var = ind
319                break
320        if var < 0:
321            return (rec, None, [])
322        #ilis = row[var]
323        ilis = row[var].merge(simplename=simplename, fillvalue=fillvalue, reindex=reindex)
324        oldname = rec.lname[var]
325        if ilis.lname == ['i0']:
326            newname = [oldname]
327            ilis.setname(newname)
328        elif not simplename:
329            newname = [oldname + '_' + name for name in ilis.lname]
330            ilis.setname(newname)
331        else:
332            newname = copy(ilis.lname)
333        for name in rec.lname:
334            if name in newname:
335                newname.remove(name)
336            else:
337                updidx = name in ilis.lname and not updateidx
338                #ilis.addindex({name: [rec.nindex(name)[0]] * len(ilis)},
339                ilis.addindex(ilis.field([rec.nindex(name)[0]] * len(ilis), name),
340                              merge=mergeidx, update=updidx)
341        return (ilis, oldname, newname)
342
343# %% special
344    def __str__(self):
345        '''return string format for var and lidx'''
346        stri = ''
347        if self.lvar:
348            stri += 'variables :\n'
349            for idx in self.lvar:
350                stri += '    ' + str(idx) + '\n'
351        if self.lidx:
352            stri += 'index :\n'
353            for idx in self.lidx:
354                stri += '    ' + str(idx) + '\n'
355        return stri
356
357    def __add__(self, other):
358        ''' Add other's values to self's values in a new Dataset'''
359        newil = copy(self)
360        newil.__iadd__(other)
361        return newil
362
363    def __iadd__(self, other):
364        ''' Add other's values to self's values'''
365        return self.add(other, name=True, solve=False)
366
367    def __or__(self, other):
368        ''' Add other's index to self's index in a new Dataset'''
369        newil = copy(self)
370        newil.__ior__(other)
371        return newil
372
373    def __ior__(self, other):
374        ''' Add other's index to self's index'''
375        return self.orindex(other, first=False, merge=True, update=False)
376
377# %% property
378    @property
379    def consistent(self):
380        ''' True if all the record are different'''
381        selfiidx = self.iidx
382        if not selfiidx:
383            return True
384        return max(Counter(zip(*selfiidx)).values()) == 1
385
386    @property
387    def extidx(self):
388        '''idx values (see data model)'''
389        return [idx.values for idx in self.lidx]
390
391    @property
392    def extidxext(self):
393        '''idx val (see data model)'''
394        return [idx.val for idx in self.lidx]
395
396    @property
397    def idxname(self):
398        ''' list of idx name'''
399        return [idx.name for idx in self.lidx]
400
401    @property
402    def idxlen(self):
403        ''' list of idx codec length'''
404        return [len(idx.codec) for idx in self.lidx]
405
406    @property
407    def iidx(self):
408        ''' list of keys for each idx'''
409        return [idx.keys for idx in self.lidx]
410
411    @property
412    def lenidx(self):
413        ''' number of idx'''
414        return len(self.lidx)
415
416    @property
417    def lidx(self):
418        '''list of idx'''
419        return [self.lindex[i] for i in self.lidxrow]
420
421    @property
422    def lisvar(self):
423        '''list of boolean : True if Field is var'''
424        return [name in self.lvarname for name in self.lname]
425
426    @property
427    def lvar(self):
428        '''list of var'''
429        return [self.lindex[i] for i in self.lvarrow]
430
431    @property
432    def lvarrow(self):
433        '''list of var row'''
434        return [self.lname.index(name) for name in self.lvarname]
435
436    @property
437    def lidxrow(self):
438        '''list of idx row'''
439        return [i for i in range(self.lenindex) if i not in self.lvarrow]
440
441    @property
442    def primary(self):
443        ''' list of primary idx'''
444        return [self.lidxrow.index(self.lname.index(name)) for name in self.primaryname]
445
446    @property
447    def secondary(self):
448        ''' list of secondary idx'''
449        return [self.lidxrow.index(self.lname.index(name)) for name in self.secondaryname]
450
451    @property
452    def setidx(self):
453        '''list of codec for each idx'''
454        return [idx.codec for idx in self.lidx]
455
456    @property
457    def zip(self):
458        '''return a zip format for transpose(extidx) : tuple(tuple(rec))'''
459        textidx = Cutil.transpose(self.extidx)
460        if not textidx:
461            return None
462        return tuple(tuple(idx) for idx in textidx)
463
464    # %% structure
465    def addindex(self, index, first=False, merge=False, update=False):
466        '''add a new index.
467
468        *Parameters*
469
470        - **index** : Field - index to add (can be index Ntv representation)
471        - **first** : If True insert index at the first row, else at the end
472        - **merge** : create a new index if merge is False
473        - **update** : if True, update actual values if index name is present (and merge is True)
474
475        *Returns* : none '''
476        idx = self.field.ntv(index)
477        idxname = self.lname
478        if len(idx) != len(self) and len(self) > 0:
479            raise DatasetError('sizes are different')
480        if not idx.name in idxname:
481            if first:
482                self.lindex.insert(0, idx)
483            else:
484                self.lindex.append(idx)
485        elif not merge:  # si idx.name in idxname
486            while idx.name in idxname:
487                idx.name += '(2)'
488            if first:
489                self.lindex.insert(0, idx)
490            else:
491                self.lindex.append(idx)
492        elif update:  # si merge et si idx.name in idxname
493            self.lindex[idxname.index(idx.name)].setlistvalue(idx.values)
494
495    def append(self, record, unique=False):
496        '''add a new record.
497
498        *Parameters*
499
500        - **record** :  list of new index values to add to Dataset
501        - **unique** :  boolean (default False) - Append isn't done if unique
502        is True and record present
503
504        *Returns* : list - key record'''
505        if self.lenindex != len(record):
506            raise DatasetError('len(record) not consistent')
507        record = self.field.l_to_i(record)
508        if self.isinrecord(self.idxrecord(record), False) and unique:
509            return None
510        return [self.lindex[i].append(record[i]) for i in range(self.lenindex)]
511
512    def applyfilter(self, reverse=False, filtname=FILTER, delfilter=True, inplace=True):
513        '''delete records with defined filter value.
514        Filter is deleted after record filtering.
515
516        *Parameters*
517
518        - **reverse** :  boolean (default False) - delete record with filter's
519        value is reverse
520        - **filtname** : string (default FILTER) - Name of the filter Field added
521        - **delfilter** :  boolean (default True) - If True, delete filter's Field
522        - **inplace** : boolean (default True) - if True, filter is apply to self,
523
524        *Returns* : self or new Dataset'''
525        if not filtname in self.lname:
526            return None
527        if inplace:
528            ilis = self
529        else:
530            ilis = copy(self)
531        ifilt = ilis.lname.index(filtname)
532        ilis.sort([ifilt], reverse=not reverse, func=None)
533        lisind = ilis.lindex[ifilt].recordfromvalue(reverse)
534        if lisind:
535            minind = min(lisind)
536            for idx in ilis.lindex:
537                del idx.keys[minind:]
538        if inplace:
539            self.delindex(filtname)
540        else:
541            ilis.delindex(filtname)
542            if delfilter:
543                self.delindex(filtname)
544        ilis.reindex()
545        return ilis
546
547    def coupling(self, derived=True, level=0.1):
548        '''Transform idx with low dist in coupled or derived indexes (codec extension).
549
550        *Parameters*
551
552        - **level** : float (default 0.1) - param threshold to apply coupling.
553        - **derived** : boolean (default : True). If True, indexes are derived,
554        else coupled.
555
556        *Returns* : None'''
557        ana = self.analysis
558        child = [[] for _ in range(len(ana))]
559        childroot = []
560        level = level * len(self)
561        for idx in range(self.lenindex):
562            if derived:
563                iparent = ana.fields[idx].p_distomin.index
564            else:
565                iparent = ana.fields[idx].p_distance.index
566            if iparent == -1:
567                childroot.append(idx)
568            else:
569                child[iparent].append(idx)
570        for idx in childroot:
571            self._couplingidx(idx, child, derived, level, ana)
572
573    def _couplingidx(self, idx, child, derived, level, ana):
574        ''' Field coupling (included childrens of the Field)'''
575        fields = ana.fields
576        if derived:
577            iparent = fields[idx].p_distomin.index
578            dparent = ana.get_relation(*sorted([idx, iparent])).distomin
579        else:
580            iparent = fields[idx].p_distance.index
581            dparent = ana.get_relation(*sorted([idx, iparent])).distance
582        # if fields[idx].category in ('coupled', 'unique') or iparent == -1\
583        if fields[idx].category in ('coupled', 'unique') \
584                or dparent >= level or dparent == 0:
585            return
586        if child[idx]:
587            for childidx in child[idx]:
588                self._couplingidx(childidx, child, derived, level, ana)
589        self.lindex[iparent].coupling(self.lindex[idx], derived=derived,
590                                      duplicate=False)
591        return
592
593    def delrecord(self, record, extern=True):
594        '''remove a record.
595
596        *Parameters*
597
598        - **record** :  list - index values to remove to Dataset
599        - **extern** : if True, compare record values to external representation
600        of self.value, else, internal
601
602        *Returns* : row deleted'''
603        self.reindex()
604        reckeys = self.valtokey(record, extern=extern)
605        if None in reckeys:
606            return None
607        row = self.tiindex.index(reckeys)
608        for idx in self:
609            del idx[row]
610        return row
611
612    def _fullindex(self, ind, keysadd, indexname, varname, leng, fillvalue, fillextern):
613        if not varname:
614            varname = []
615        idx = self.lindex[ind]
616        lenadd = len(keysadd[0])
617        if len(idx) == leng:
618            return
619        #inf = self.indexinfos()
620        ana = self.anafields
621        parent = ana[ind].p_derived.view('index')
622        # if inf[ind]['cat'] == 'unique':
623        if ana[ind].category == 'unique':
624            idx.set_keys(idx.keys + [0] * lenadd)
625        elif self.lname[ind] in indexname:
626            idx.set_keys(idx.keys + keysadd[indexname.index(self.lname[ind])])
627        # elif inf[ind]['parent'] == -1 or self.lname[ind] in varname:
628        elif parent == -1 or self.lname[ind] in varname:
629            fillval = fillvalue
630            if fillextern:
631                fillval = self.field.s_to_i(fillvalue)
632            idx.set_keys(idx.keys + [len(idx.codec)] * len(keysadd[0]))
633            idx.set_codec(idx.codec + [fillval])
634        else:
635            #parent = inf[ind]['parent']
636            if len(self.lindex[parent]) != leng:
637                self._fullindex(parent, keysadd, indexname, varname, leng,
638                                fillvalue, fillextern)
639            # if inf[ind]['cat'] == 'coupled':
640            if ana[ind].category == 'coupled':
641                idx.tocoupled(self.lindex[parent], coupling=True)
642            else:
643                idx.tocoupled(self.lindex[parent], coupling=False)
644
645    def full(self, reindex=False, idxname=None, varname=None, fillvalue='-',
646             fillextern=True, inplace=True, canonical=True):
647        '''tranform a list of indexes in crossed indexes (value extension).
648
649        *Parameters*
650
651        - **idxname** : list of string - name of indexes to transform
652        - **varname** : string - name of indexes to use
653        - **reindex** : boolean (default False) - if True, set default codec
654        before transformation
655        - **fillvalue** : object value used for var extension
656        - **fillextern** : boolean(default True) - if True, fillvalue is converted
657        to internal value
658        - **inplace** : boolean (default True) - if True, filter is apply to self,
659        - **canonical** : boolean (default True) - if True, Field are ordered
660        in canonical order
661
662        *Returns* : self or new Dataset'''
663        ilis = self if inplace else copy(self)
664        if not idxname:
665            idxname = ilis.primaryname
666        if reindex:
667            ilis.reindex()
668        keysadd = Cutil.idxfull([ilis.nindex(name) for name in idxname])
669        if keysadd and len(keysadd) != 0:
670            newlen = len(keysadd[0]) + len(ilis)
671            for ind in range(ilis.lenindex):
672                ilis._fullindex(ind, keysadd, idxname, varname, newlen,
673                                fillvalue, fillextern)
674        if canonical:
675            ilis.setcanonorder()
676        return ilis
677
678    def getduplicates(self, indexname=None, resindex=None, indexview=None):
679        '''check duplicate cod in a list of indexes. Result is add in a new
680        index or returned.
681
682        *Parameters*
683
684        - **indexname** : list of string (default none) - name of indexes to check
685        (if None, all Field)
686        - **resindex** : string (default None) - Add a new index named resindex
687        with check result (False if duplicate)
688        - **indexview** : list of str (default None) - list of fields to return
689
690        *Returns* : list of int - list of rows with duplicate cod '''
691        if not indexname:
692            indexname = self.lname
693        duplicates = []
694        for name in indexname:
695            duplicates += self.nindex(name).getduplicates()
696        if resindex and isinstance(resindex, str):
697            newidx = self.field([True] * len(self), name=resindex)
698            for item in duplicates:
699                newidx[item] = False
700            self.addindex(newidx)
701        dupl = tuple(set(duplicates))
702        if not indexview:
703            return dupl
704        return [tuple(self.record(ind, indexview)) for ind in dupl]
705
706    def iscanonorder(self):
707        '''return True if primary indexes have canonical ordered keys'''
708        primary = self.primary
709        canonorder = Cutil.canonorder(
710            [len(self.lidx[idx].codec) for idx in primary])
711        return canonorder == [self.lidx[idx].keys for idx in primary]
712
713    def isinrecord(self, record, extern=True):
714        '''Check if record is present in self.
715
716        *Parameters*
717
718        - **record** : list - value for each Field
719        - **extern** : if True, compare record values to external representation
720        of self.value, else, internal
721
722        *Returns boolean* : True if found'''
723        if extern:
724            return record in Cutil.transpose(self.extidxext)
725        return record in Cutil.transpose(self.extidx)
726
727    def idxrecord(self, record):
728        '''return rec array (without variable) from complete record (with variable)'''
729        return [record[self.lidxrow[i]] for i in range(len(self.lidxrow))]
730
731    def keytoval(self, listkey, extern=True):
732        '''
733        convert a keys list (key for each index) to a values list (value for each index).
734
735        *Parameters*
736
737        - **listkey** : key for each index
738        - **extern** : boolean (default True) - if True, compare rec to val else to values
739
740        *Returns*
741
742        - **list** : value for each index'''
743        return [idx.keytoval(key, extern=extern) for idx, key in zip(self.lindex, listkey)]
744
745    def loc(self, rec, extern=True, row=False):
746        '''
747        Return record or row corresponding to a list of idx values.
748
749        *Parameters*
750
751        - **rec** : list - value for each idx
752        - **extern** : boolean (default True) - if True, compare rec to val,
753        else to values
754        - **row** : Boolean (default False) - if True, return list of row,
755        else list of records
756
757        *Returns*
758
759        - **object** : variable value or None if not found'''
760        locrow = None
761        try:
762            if len(rec) == self.lenindex:
763                locrow = list(set.intersection(*[set(self.lindex[i].loc(rec[i], extern))
764                                               for i in range(self.lenindex)]))
765            elif len(rec) == self.lenidx:
766                locrow = list(set.intersection(*[set(self.lidx[i].loc(rec[i], extern))
767                                               for i in range(self.lenidx)]))
768        except:
769            pass
770        if locrow is None:
771            return None
772        if row:
773            return locrow
774        return [self.record(locr, extern=extern) for locr in locrow]
775
776    def mix(self, other, fillvalue=None):
777        '''add other Field not included in self and add other's values'''
778        sname = set(self.lname)
779        oname = set(other.lname)
780        newself = copy(self)
781        copother = copy(other)
782        for nam in oname - sname:
783            newself.addindex({nam: [fillvalue] * len(newself)})
784        for nam in sname - oname:
785            copother.addindex({nam: [fillvalue] * len(copother)})
786        return newself.add(copother, name=True, solve=False)
787
788    def merging(self, listname=None):
789        ''' add a new Field build with Field define in listname.
790        Values of the new Field are set of values in listname Field'''
791        #self.addindex(Field.merging([self.nindex(name) for name in listname]))
792        self.addindex(Sfield.merging([self.nindex(name) for name in listname]))
793
794    def orindex(self, other, first=False, merge=False, update=False):
795        ''' Add other's index to self's index (with same length)
796
797        *Parameters*
798
799        - **other** : self class - object to add
800        - **first** : Boolean (default False) - If True insert indexes
801        at the first row, else at the end
802        - **merge** : Boolean (default False) - create a new index
803        if merge is False
804        - **update** : Boolean (default False) - if True, update actual
805        values if index name is present (and merge is True)
806
807        *Returns* : none '''
808        if len(self) != 0 and len(self) != len(other) and len(other) != 0:
809            raise DatasetError("the sizes are not equal")
810        otherc = copy(other)
811        for idx in otherc.lindex:
812            self.addindex(idx, first=first, merge=merge, update=update)
813        return self
814
815    def record(self, row, indexname=None, extern=True):
816        '''return the record at the row
817
818        *Parameters*
819
820        - **row** : int - row of the record
821        - **extern** : boolean (default True) - if True, return val record else
822        value record
823        - **indexname** : list of str (default None) - list of fields to return
824        *Returns*
825
826        - **list** : val record or value record'''
827        if indexname is None:
828            indexname = self.lname
829        if extern:
830            record = [idx.val[row] for idx in self.lindex]
831            #record = [idx.values[row].to_obj() for idx in self.lindex]
832            #record = [idx.valrow(row) for idx in self.lindex]
833        else:
834            record = [idx.values[row] for idx in self.lindex]
835        return [record[self.lname.index(name)] for name in indexname]
836
837    def recidx(self, row, extern=True):
838        '''return the list of idx val or values at the row
839
840        *Parameters*
841
842        - **row** : int - row of the record
843        - **extern** : boolean (default True) - if True, return val rec else value rec
844
845        *Returns*
846
847        - **list** : val or value for idx'''
848        if extern:
849            return [idx.values[row].to_obj() for idx in self.lidx]
850            # return [idx.valrow(row) for idx in self.lidx]
851        return [idx.values[row] for idx in self.lidx]
852
853    def recvar(self, row, extern=True):
854        '''return the list of var val or values at the row
855
856        *Parameters*
857
858        - **row** : int - row of the record
859        - **extern** : boolean (default True) - if True, return val rec else value rec
860
861        *Returns*
862
863        - **list** : val or value for var'''
864        if extern:
865            return [idx.values[row].to_obj() for idx in self.lvar]
866            # return [idx.valrow(row) for idx in self.lvar]
867        return [idx.values[row] for idx in self.lvar]
868
869    def setcanonorder(self, reindex=False):
870        '''Set the canonical index order : primary - secondary/unique - variable.
871        Set the canonical keys order : ordered keys in the first columns.
872
873        *Parameters*
874        - **reindex** : boolean (default False) - if True, set default codec after
875        transformation
876
877        *Return* : self'''
878        order = self.primaryname
879        order += self.secondaryname
880        order += self.lvarname
881        order += self.lunicname
882        self.swapindex(order)
883        self.sort(reindex=reindex)
884        # self.analysis.actualize()
885        return self
886
887    def setfilter(self, filt=None, first=False, filtname=FILTER, unique=False):
888        '''Add a filter index with boolean values
889
890        - **filt** : list of boolean - values of the filter idx to add
891        - **first** : boolean (default False) - If True insert index at the first row,
892        else at the end
893        - **filtname** : string (default FILTER) - Name of the filter Field added
894
895        *Returns* : self'''
896        if not filt:
897            filt = [True] * len(self)
898        idx = self.field(filt, name=filtname)
899        idx.reindex()
900        if not idx.cod in ([True, False], [False, True], [True], [False]):
901            raise DatasetError('filt is not consistent')
902        if unique:
903            for name in self.lname:
904                if name[:len(FILTER)] == FILTER:
905                    self.delindex(FILTER)
906        self.addindex(idx, first=first)
907        return self
908
909    def sort(self, order=None, reverse=False, func=str, reindex=True):
910        '''Sort data following the index order and apply the ascending or descending
911        sort function to values.
912
913        *Parameters*
914
915        - **order** : list (default None)- new order of index to apply. If None or [],
916        the sort function is applied to the existing order of indexes.
917        - **reverse** : boolean (default False)- ascending if True, descending if False
918        - **func**    : function (default str) - parameter key used in the sorted function
919        - **reindex** : boolean (default True) - if True, apply a new codec order (key = func)
920
921        *Returns* : self'''
922        if not order:
923            order = list(range(self.lenindex))
924        orderfull = order + list(set(range(self.lenindex)) - set(order))
925        if reindex:
926            for i in order:
927                self.lindex[i].reindex(codec=sorted(
928                    self.lindex[i].codec, key=func))
929        newidx = Cutil.transpose(sorted(Cutil.transpose(
930            [self.lindex[orderfull[i]].keys for i in range(self.lenindex)]),
931            reverse=reverse))
932        for i in range(self.lenindex):
933            self.lindex[orderfull[i]].set_keys(newidx[i])
934        return self
935
936    """
937    def swapindex(self, order):
938        '''
939        Change the order of the index .
940
941        *Parameters*
942
943        - **order** : list of int or list of name - new order of index to apply.
944
945        *Returns* : self '''
946        if self.lenindex != len(order):
947            raise DatasetError('length of order and Dataset different')
948        if not order or isinstance(order[0], int):
949            self.lindex = [self.lindex[ind] for ind in order]
950        elif isinstance(order[0], str):
951            self.lindex = [self.nindex(name) for name in order]
952        return self
953    """
954
955    def tostdcodec(self, inplace=False, full=True):
956        '''Transform all codec in full or default codec.
957
958        *Parameters*
959
960        - **inplace** : boolean  (default False) - if True apply transformation
961        to self, else to a new Dataset
962        - **full** : boolean (default True)- full codec if True, default if False
963
964
965        *Return Dataset* : self or new Dataset'''
966        lindex = [idx.tostdcodec(inplace=False, full=full)
967                  for idx in self.lindex]
968        if inplace:
969            self.lindex = lindex
970            return self
971        return self.__class__(lindex, self.lvarname)
972
973    def updateindex(self, listvalue, index, extern=True):
974        '''update values of an index.
975
976        *Parameters*
977
978        - **listvalue** : list - index values to replace
979        - **index** : integer - index row to update
980        - **extern** : if True, the listvalue has external representation, else internal
981
982        *Returns* : none '''
983        self.lindex[index].setlistvalue(listvalue, extern=extern)
984
985    def valtokey(self, rec, extern=True):
986        '''convert a record list (value or val for each idx) to a key list
987        (key for each index).
988
989        *Parameters*
990
991        - **rec** : list of value or val for each index
992        - **extern** : if True, the rec value has external representation, else internal
993
994        *Returns*
995
996        - **list of int** : record key for each index'''
997        return [idx.valtokey(val, extern=extern) for idx, val in zip(self.lindex, rec)]

Sdataset is a child class of Cdataset where internal value can be different from external value (list is converted in tuple and dict in json-object).

One attribute is added: 'field' to define the 'field' class.

The methods defined in this class are :

constructor (@classmethod)

dynamic value - module analysis (getters @property)

dynamic value (getters @property)

  • Cdataset.keys
  • Cdataset.iindex
  • Cdataset.indexlen
  • Cdataset.lenindex
  • Cdataset.lname
  • Cdataset.tiindex

global value (getters @property)

selecting - infos methods

add - update methods

structure management - methods

exports methods (observation.dataset_interface.DatasetInterface)

  • Dataset.json
  • Dataset.plot
  • Dataset.to_csv
  • Dataset.to_dataframe
  • Dataset.to_file
  • Dataset.to_ntv
  • Dataset.to_obj
  • Dataset.to_xarray
  • Dataset.view
  • Dataset.vlist
  • Dataset.voxel
Sdataset(listidx=None, name=None, reindex=True)
153    def __init__(self, listidx=None, name=None, reindex=True):
154        '''
155        Dataset constructor.
156
157        *Parameters*
158
159        - **listidx** :  list (default None) - list of Field data
160        - **name** :  string (default None) - name of the dataset
161        - **reindex** : boolean (default True) - if True, default codec for each Field'''
162
163        self.field = self.field_class
164        Cdataset.__init__(self, listidx, name, reindex=reindex)

Dataset constructor.

Parameters

  • listidx : list (default None) - list of Field data
  • name : string (default None) - name of the dataset
  • reindex : boolean (default True) - if True, default codec for each Field
@classmethod
def from_csv( cls, filename='dataset.csv', header=True, nrow=None, decode_str=True, decode_json=True, optcsv={'quoting': 2}):
166    @classmethod
167    def from_csv(cls, filename='dataset.csv', header=True, nrow=None, decode_str=True,
168                 decode_json=True, optcsv={'quoting': csv.QUOTE_NONNUMERIC}):
169        '''
170        Dataset constructor (from a csv file). Each column represents index values.
171
172        *Parameters*
173
174        - **filename** : string (default 'dataset.csv'), name of the file to read
175        - **header** : boolean (default True). If True, the first raw is dedicated to names
176        - **nrow** : integer (default None). Number of row. If None, all the row else nrow
177        - **optcsv** : dict (default : quoting) - see csv.reader options'''
178        if not optcsv:
179            optcsv = {}
180        if not nrow:
181            nrow = -1
182        with open(filename, newline='', encoding="utf-8") as file:
183            reader = csv.reader(file, **optcsv)
184            irow = 0
185            for row in reader:
186                if irow == nrow:
187                    break
188                if irow == 0:
189                    idxval = [[] for i in range(len(row))]
190                    idxname = [''] * len(row)
191                if irow == 0 and header:
192                    idxname = row
193                else:
194                    for i in range(len(row)):
195                        if decode_json:
196                            try:
197                                idxval[i].append(json.loads(row[i]))
198                            except:
199                                idxval[i].append(row[i])
200                        else:
201                            idxval[i].append(row[i])
202                irow += 1
203        lindex = [cls.field_class.from_ntv(
204            {name: idx}, decode_str=decode_str) for idx, name in zip(idxval, idxname)]
205        return cls(listidx=lindex, reindex=True)

Dataset constructor (from a csv file). Each column represents index values.

Parameters

  • filename : string (default 'dataset.csv'), name of the file to read
  • header : boolean (default True). If True, the first row is dedicated to names
  • nrow : integer (default None). Number of rows to read. If None, all the rows are read, else only nrow
  • optcsv : dict (default : quoting) - see csv.reader options
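
A minimal usage sketch (the file name 'example.csv' and the column names are illustrative, not part of the package): the first row gives the Field names, the following rows give the values.

    import csv
    from tab_dataset.dataset import Sdataset

    # write a small csv file, then load it as a Sdataset
    with open('example.csv', 'w', newline='', encoding='utf-8') as file:
        writer = csv.writer(file, quoting=csv.QUOTE_NONNUMERIC)
        writer.writerow(['month', 'temp'])   # header row -> Field names
        writer.writerow(['jan', 10])
        writer.writerow(['feb', 12])

    ds = Sdataset.from_csv('example.csv', header=True)
    print(ds.lname)   # ['month', 'temp']
    print(len(ds))    # 2 records
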
@classmethod
def from_file(cls, filename, forcestring=False, reindex=True, decode_str=False):
207    @classmethod
208    def from_file(cls, filename, forcestring=False, reindex=True, decode_str=False):
209        '''
210        Generate Object from file storage.
211
212         *Parameters*
213
214        - **filename** : string - file name (with path)
215        - **forcestring** : boolean (default False) - if True,
216        forces the UTF-8 data format, else the format is calculated
217        - **reindex** : boolean (default True) - if True, default codec for each Field
218        - **decode_str**: boolean (default False) - if True, string are loaded in json data
219
220        *Returns* : new Object'''
221        with open(filename, 'rb') as file:
222            btype = file.read(1)
223        if btype == bytes('[', 'UTF-8') or btype == bytes('{', 'UTF-8') or forcestring:
224            with open(filename, 'r', newline='', encoding="utf-8") as file:
225                bjson = file.read()
226        else:
227            with open(filename, 'rb') as file:
228                bjson = file.read()
229        return cls.from_ntv(bjson, reindex=reindex, decode_str=decode_str)

Generate Object from file storage.

Parameters

  • filename : string - file name (with path)
  • forcestring : boolean (default False) - if True, forces the UTF-8 data format, else the format is calculated
  • reindex : boolean (default True) - if True, default codec for each Field
  • decode_str : boolean (default False) - if True, strings are loaded as json data

Returns : new Object

def merge(self, fillvalue=nan, reindex=False, simplename=False):
231    def merge(self, fillvalue=math.nan, reindex=False, simplename=False):
232        '''
233        Merge method replaces Dataset objects included into its constituents.
234
235        *Parameters*
236
237        - **fillvalue** : object (default nan) - value used for the additional data
238        - **reindex** : boolean (default False) - if True, set default codec after transformation
239        - **simplename** : boolean (default False) - if True, new Field name are
240        the same as merged Field name else it is a composed name.
241
242        *Returns*: merged Dataset '''
243        ilc = copy(self)
244        delname = []
245        row = ilc[0]
246        if not isinstance(row, list):
247            row = [row]
248        merged, oldname, newname = self.__class__._mergerecord(
249            self.ext(row, ilc.lname), simplename=simplename, fillvalue=fillvalue,
250            reindex=reindex)
251        delname.append(oldname)
252        for ind in range(1, len(ilc)):
253            oldidx = ilc.nindex(oldname)
254            for name in newname:
255                ilc.addindex(self.field(oldidx.codec, name, oldidx.keys))
256            row = ilc[ind]
257            if not isinstance(row, list):
258                row = [row]
259            rec, oldname, newname = self.__class__._mergerecord(
260                self.ext(row, ilc.lname), simplename=simplename)
261            if oldname and newname != [oldname]:
262                delname.append(oldname)
263            for name in newname:
264                oldidx = merged.nindex(oldname)
265                fillval = self.field.s_to_i(fillvalue)
266                merged.addindex(
267                    self.field([fillval] * len(merged), name, oldidx.keys))
268            merged += rec
269        for name in set(delname):
270            if name:
271                merged.delindex(name)
272        if reindex:
273            merged.reindex()
274        ilc.lindex = merged.lindex
275        return ilc

The merge method replaces the Dataset objects included in self with their constituent Fields.

Parameters

  • fillvalue : object (default nan) - value used for the additional data
  • reindex : boolean (default False) - if True, set default codec after transformation
  • simplename : boolean (default False) - if True, new Field names are the same as the merged Field names, else a composed name is used.

Returns: merged Dataset

@classmethod
def ext(cls, idxval=None, idxname=None, reindex=True, fast=False):
277    @classmethod
278    def ext(cls, idxval=None, idxname=None, reindex=True, fast=False):
279        '''
280        Dataset constructor (external index).
281
282        *Parameters*
283
284        - **idxval** : list of Field or list of values (see data model)
285        - **idxname** : list of string (default None) - list of Field name (see data model)'''
286        if idxval is None:
287            idxval = []
288        if not isinstance(idxval, list):
289            return None
290        val = []
291        for idx in idxval:
292            if not isinstance(idx, list):
293                val.append([idx])
294            else:
295                val.append(idx)
296        lenval = [len(idx) for idx in val]
297        if lenval and max(lenval) != min(lenval):
298            raise DatasetError('the length of Iindex are different')
299        length = lenval[0] if lenval else 0
300        idxname = [None] * len(val) if idxname is None else idxname
301        for ind, name in enumerate(idxname):
302            if name is None or name == '$default':
303                idxname[ind] = 'i'+str(ind)
304        lindex = [cls.field_class(codec, name, lendefault=length, reindex=reindex,
305                                  fast=fast) for codec, name in zip(val, idxname)]
306        return cls(lindex, reindex=False)

Dataset constructor (external index).

Parameters

  • idxval : list of Field or list of values (see data model)
  • idxname : list of string (default None) - list of Field name (see data model)
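
A minimal sketch of the ext constructor (the Field names and values are illustrative): each inner list holds the external values of one Field, and all lists must have the same length.

    from tab_dataset.dataset import Sdataset

    ds = Sdataset.ext([['morning', 'evening', 'morning', 'evening'],
                       ['jan', 'jan', 'feb', 'feb'],
                       [10, 12, 9, 11]],
                      ['period', 'month', 'temp'])
    print(ds.lname)      # ['period', 'month', 'temp']
    print(ds.lenindex)   # 3 Fields
    print(len(ds))       # 4 records
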
consistent

True if all the records are different

extidx

idx values (see data model)

extidxext

idx val (see data model)

idxname

list of idx name

idxlen

list of idx codec length

iidx

list of keys for each idx

lenidx

number of idx

lidx

list of idx

lisvar

list of boolean : True if Field is var

lvar

list of var

lvarrow

list of var row

lidxrow

list of idx row

primary

list of primary idx

secondary

list of secondary idx

setidx

list of codec for each idx

zip

return a zip format for transpose(extidx) : tuple(tuple(rec))

def addindex(self, index, first=False, merge=False, update=False):
465    def addindex(self, index, first=False, merge=False, update=False):
466        '''add a new index.
467
468        *Parameters*
469
470        - **index** : Field - index to add (can be index Ntv representation)
471        - **first** : If True insert index at the first row, else at the end
472        - **merge** : create a new index if merge is False
473        - **update** : if True, update actual values if index name is present (and merge is True)
474
475        *Returns* : none '''
476        idx = self.field.ntv(index)
477        idxname = self.lname
478        if len(idx) != len(self) and len(self) > 0:
479            raise DatasetError('sizes are different')
480        if not idx.name in idxname:
481            if first:
482                self.lindex.insert(0, idx)
483            else:
484                self.lindex.append(idx)
485        elif not merge:  # si idx.name in idxname
486            while idx.name in idxname:
487                idx.name += '(2)'
488            if first:
489                self.lindex.insert(0, idx)
490            else:
491                self.lindex.append(idx)
492        elif update:  # si merge et si idx.name in idxname
493            self.lindex[idxname.index(idx.name)].setlistvalue(idx.values)

add a new index.

Parameters

  • index : Field - index to add (can be index Ntv representation)
  • first : If True insert index at the first row, else at the end
  • merge : create a new index if merge is False
  • update : if True, update actual values if index name is present (and merge is True)

Returns : none
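
A minimal sketch (Field names and values are illustrative): the index may be given as its Ntv representation {name: values}, with the same length as the Dataset.

    from tab_dataset.dataset import Sdataset

    ds = Sdataset.ext([['jan', 'feb', 'mar']], ['month'])
    ds.addindex({'quarter': ['q1', 'q1', 'q1']})
    print(ds.lname)   # ['month', 'quarter']

    # without merge, an existing name is kept distinct by adding a '(2)' suffix
    ds.addindex({'quarter': ['q1', 'q1', 'q1']})
    print(ds.lname)   # ['month', 'quarter', 'quarter(2)']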

def append(self, record, unique=False):
495    def append(self, record, unique=False):
496        '''add a new record.
497
498        *Parameters*
499
500        - **record** :  list of new index values to add to Dataset
501        - **unique** :  boolean (default False) - Append isn't done if unique
502        is True and record present
503
504        *Returns* : list - key record'''
505        if self.lenindex != len(record):
506            raise DatasetError('len(record) not consistent')
507        record = self.field.l_to_i(record)
508        if self.isinrecord(self.idxrecord(record), False) and unique:
509            return None
510        return [self.lindex[i].append(record[i]) for i in range(self.lenindex)]

add a new record.

Parameters

  • record : list of new index values to add to Dataset
  • unique : boolean (default False) - the append is not done if unique is True and the record is already present

Returns : list - key record
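
A minimal sketch (values are illustrative): a record supplies one external value per Field, in the Field order given by lname.

    from tab_dataset.dataset import Sdataset

    ds = Sdataset.ext([['jan', 'feb'], [10, 12]], ['month', 'temp'])
    ds.append(['mar', 9])                      # returns the key added in each Field
    print(len(ds))                             # 3 records
    print(ds.append(['mar', 9], unique=True))  # None : the record is already present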

def applyfilter( self, reverse=False, filtname='$filter', delfilter=True, inplace=True):
512    def applyfilter(self, reverse=False, filtname=FILTER, delfilter=True, inplace=True):
513        '''delete records with defined filter value.
514        Filter is deleted after record filtering.
515
516        *Parameters*
517
518        - **reverse** :  boolean (default False) - delete record with filter's
519        value is reverse
520        - **filtname** : string (default FILTER) - Name of the filter Field added
521        - **delfilter** :  boolean (default True) - If True, delete filter's Field
522        - **inplace** : boolean (default True) - if True, filter is apply to self,
523
524        *Returns* : self or new Dataset'''
525        if not filtname in self.lname:
526            return None
527        if inplace:
528            ilis = self
529        else:
530            ilis = copy(self)
531        ifilt = ilis.lname.index(filtname)
532        ilis.sort([ifilt], reverse=not reverse, func=None)
533        lisind = ilis.lindex[ifilt].recordfromvalue(reverse)
534        if lisind:
535            minind = min(lisind)
536            for idx in ilis.lindex:
537                del idx.keys[minind:]
538        if inplace:
539            self.delindex(filtname)
540        else:
541            ilis.delindex(filtname)
542            if delfilter:
543                self.delindex(filtname)
544        ilis.reindex()
545        return ilis

delete records according to the filter Field values. The filter Field is deleted after record filtering.

Parameters

  • reverse : boolean (default False) - delete the records whose filter value equals reverse
  • filtname : string (default FILTER) - Name of the filter Field added
  • delfilter : boolean (default True) - If True, delete filter's Field
  • inplace : boolean (default True) - if True, the filter is applied to self, else to a new Dataset

Returns : self or new Dataset

def coupling(self, derived=True, level=0.1):
547    def coupling(self, derived=True, level=0.1):
548        '''Transform idx with low dist in coupled or derived indexes (codec extension).
549
550        *Parameters*
551
552        - **level** : float (default 0.1) - param threshold to apply coupling.
553        - **derived** : boolean (default : True). If True, indexes are derived,
554        else coupled.
555
556        *Returns* : None'''
557        ana = self.analysis
558        child = [[] for _ in range(len(ana))]
559        childroot = []
560        level = level * len(self)
561        for idx in range(self.lenindex):
562            if derived:
563                iparent = ana.fields[idx].p_distomin.index
564            else:
565                iparent = ana.fields[idx].p_distance.index
566            if iparent == -1:
567                childroot.append(idx)
568            else:
569                child[iparent].append(idx)
570        for idx in childroot:
571            self._couplingidx(idx, child, derived, level, ana)

Transform idx with a low distance into coupled or derived indexes (codec extension).

Parameters

  • level : float (default 0.1) - threshold parameter used to apply the coupling.
  • derived : boolean (default : True). If True, indexes are derived, else coupled.

Returns : None

def delrecord(self, record, extern=True):
593    def delrecord(self, record, extern=True):
594        '''remove a record.
595
596        *Parameters*
597
598        - **record** :  list - index values to remove to Dataset
599        - **extern** : if True, compare record values to external representation
600        of self.value, else, internal
601
602        *Returns* : row deleted'''
603        self.reindex()
604        reckeys = self.valtokey(record, extern=extern)
605        if None in reckeys:
606            return None
607        row = self.tiindex.index(reckeys)
608        for idx in self:
609            del idx[row]
610        return row

remove a record.

Parameters

  • record : list - index values to remove from the Dataset
  • extern : if True, compare record values to external representation of self.value, else, internal

Returns : row deleted
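
A minimal sketch (values are illustrative): the record is given by its external values, one per Field.

    from tab_dataset.dataset import Sdataset

    ds = Sdataset.ext([['jan', 'feb', 'mar'], [10, 12, 9]], ['month', 'temp'])
    print(ds.delrecord(['feb', 12]))   # 1 : the row that was deleted
    print(len(ds))                     # 2 records left
    print(ds.delrecord(['feb', 12]))   # None : the record is no longer present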

def full( self, reindex=False, idxname=None, varname=None, fillvalue='-', fillextern=True, inplace=True, canonical=True):
645    def full(self, reindex=False, idxname=None, varname=None, fillvalue='-',
646             fillextern=True, inplace=True, canonical=True):
647        '''tranform a list of indexes in crossed indexes (value extension).
648
649        *Parameters*
650
651        - **idxname** : list of string - name of indexes to transform
652        - **varname** : string - name of indexes to use
653        - **reindex** : boolean (default False) - if True, set default codec
654        before transformation
655        - **fillvalue** : object value used for var extension
656        - **fillextern** : boolean(default True) - if True, fillvalue is converted
657        to internal value
658        - **inplace** : boolean (default True) - if True, filter is apply to self,
659        - **canonical** : boolean (default True) - if True, Field are ordered
660        in canonical order
661
662        *Returns* : self or new Dataset'''
663        ilis = self if inplace else copy(self)
664        if not idxname:
665            idxname = ilis.primaryname
666        if reindex:
667            ilis.reindex()
668        keysadd = Cutil.idxfull([ilis.nindex(name) for name in idxname])
669        if keysadd and len(keysadd) != 0:
670            newlen = len(keysadd[0]) + len(ilis)
671            for ind in range(ilis.lenindex):
672                ilis._fullindex(ind, keysadd, idxname, varname, newlen,
673                                fillvalue, fillextern)
674        if canonical:
675            ilis.setcanonorder()
676        return ilis

transform a list of indexes into crossed indexes (value extension).

Parameters

  • idxname : list of string - name of indexes to transform
  • varname : string - name of indexes to use
  • reindex : boolean (default False) - if True, set default codec before transformation
  • fillvalue : object value used for var extension
  • fillextern : boolean(default True) - if True, fillvalue is converted to internal value
  • inplace : boolean (default True) - if True, the transformation is applied to self, else to a new Dataset
  • canonical : boolean (default True) - if True, Fields are ordered in canonical order

Returns : self or new Dataset
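
A minimal sketch (Field names and values are illustrative): three records do not cover the full 'month' x 'period' crossing, so full adds the missing combination and fills the 'temp' variable with fillvalue.

    from tab_dataset.dataset import Sdataset

    ds = Sdataset.ext([['jan', 'jan', 'feb'],
                       ['morning', 'evening', 'morning'],
                       [10, 14, 9]],
                      ['month', 'period', 'temp'])
    ds.full(idxname=['month', 'period'], varname=['temp'], fillvalue='-')
    print(len(ds))          # 4 records : the missing (feb, evening) pair has been added
    print(ds.primaryname)   # the crossed Fields 'month' and 'period' are now primary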

def getduplicates(self, indexname=None, resindex=None, indexview=None):
678    def getduplicates(self, indexname=None, resindex=None, indexview=None):
679        '''check duplicate cod in a list of indexes. Result is add in a new
680        index or returned.
681
682        *Parameters*
683
684        - **indexname** : list of string (default none) - name of indexes to check
685        (if None, all Field)
686        - **resindex** : string (default None) - Add a new index named resindex
687        with check result (False if duplicate)
688        - **indexview** : list of str (default None) - list of fields to return
689
690        *Returns* : list of int - list of rows with duplicate cod '''
691        if not indexname:
692            indexname = self.lname
693        duplicates = []
694        for name in indexname:
695            duplicates += self.nindex(name).getduplicates()
696        if resindex and isinstance(resindex, str):
697            newidx = self.field([True] * len(self), name=resindex)
698            for item in duplicates:
699                newidx[item] = False
700            self.addindex(newidx)
701        dupl = tuple(set(duplicates))
702        if not indexview:
703            return dupl
704        return [tuple(self.record(ind, indexview)) for ind in dupl]

check duplicate cod in a list of indexes. The result is added to a new index or returned.

Parameters

  • indexname : list of string (default None) - names of the indexes to check (if None, all Fields)
  • resindex : string (default None) - Add a new index named resindex with check result (False if duplicate)
  • indexview : list of str (default None) - list of fields to return

Returns : list of int - list of rows with duplicate cod
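
A minimal sketch (Field names and values are illustrative): rows sharing a duplicate 'code' value are reported, and resindex adds a boolean Field with the result.

    from tab_dataset.dataset import Sdataset

    ds = Sdataset.ext([['a1', 'a2', 'a1'], [10, 20, 10]], ['code', 'value'])
    print(ds.getduplicates(['code']))                 # rows where 'code' is duplicated
    ds.getduplicates(['code'], resindex='is_unique')  # adds a boolean result Field
    print(ds.lname)                                   # ['code', 'value', 'is_unique']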

def iscanonorder(self):
706    def iscanonorder(self):
707        '''return True if primary indexes have canonical ordered keys'''
708        primary = self.primary
709        canonorder = Cutil.canonorder(
710            [len(self.lidx[idx].codec) for idx in primary])
711        return canonorder == [self.lidx[idx].keys for idx in primary]

return True if primary indexes have canonical ordered keys

def isinrecord(self, record, extern=True):
713    def isinrecord(self, record, extern=True):
714        '''Check if record is present in self.
715
716        *Parameters*
717
718        - **record** : list - value for each Field
719        - **extern** : if True, compare record values to external representation
720        of self.value, else, internal
721
722        *Returns boolean* : True if found'''
723        if extern:
724            return record in Cutil.transpose(self.extidxext)
725        return record in Cutil.transpose(self.extidx)

Check if record is present in self.

Parameters

  • record : list - value for each Field
  • extern : if True, compare record values to external representation of self.value, else, internal

Returns boolean : True if found

def idxrecord(self, record):
727    def idxrecord(self, record):
728        '''return rec array (without variable) from complete record (with variable)'''
729        return [record[self.lidxrow[i]] for i in range(len(self.lidxrow))]

return rec array (without variable) from complete record (with variable)

def keytoval(self, listkey, extern=True):
731    def keytoval(self, listkey, extern=True):
732        '''
733        convert a keys list (key for each index) to a values list (value for each index).
734
735        *Parameters*
736
737        - **listkey** : key for each index
738        - **extern** : boolean (default True) - if True, compare rec to val else to values
739
740        *Returns*
741
742        - **list** : value for each index'''
743        return [idx.keytoval(key, extern=extern) for idx, key in zip(self.lindex, listkey)]

convert a keys list (key for each index) to a values list (value for each index).

Parameters

  • listkey : key for each index
  • extern : boolean (default True) - if True, compare rec to val else to values

Returns

  • list : value for each index
def loc(self, rec, extern=True, row=False):
745    def loc(self, rec, extern=True, row=False):
746        '''
747        Return record or row corresponding to a list of idx values.
748
749        *Parameters*
750
751        - **rec** : list - value for each idx
752        - **extern** : boolean (default True) - if True, compare rec to val,
753        else to values
754        - **row** : Boolean (default False) - if True, return list of row,
755        else list of records
756
757        *Returns*
758
759        - **object** : variable value or None if not found'''
760        locrow = None
761        try:
762            if len(rec) == self.lenindex:
763                locrow = list(set.intersection(*[set(self.lindex[i].loc(rec[i], extern))
764                                               for i in range(self.lenindex)]))
765            elif len(rec) == self.lenidx:
766                locrow = list(set.intersection(*[set(self.lidx[i].loc(rec[i], extern))
767                                               for i in range(self.lenidx)]))
768        except:
769            pass
770        if locrow is None:
771            return None
772        if row:
773            return locrow
774        return [self.record(locr, extern=extern) for locr in locrow]

Return record or row corresponding to a list of idx values.

Parameters

  • rec : list - value for each idx
  • extern : boolean (default True) - if True, compare rec to val, else to values
  • row : Boolean (default False) - if True, return list of row, else list of records

Returns

  • object : variable value or None if not found
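
A minimal sketch (values are illustrative): rec gives one value per Field; the matching records (or rows with row=True) are returned.

    from tab_dataset.dataset import Sdataset

    ds = Sdataset.ext([['jan', 'feb', 'mar'], [10, 12, 9]], ['month', 'temp'])
    print(ds.loc(['feb', 12]))             # [['feb', 12]] : matching record(s)
    print(ds.loc(['feb', 12], row=True))   # [1] : matching row(s)
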
def mix(self, other, fillvalue=None):
776    def mix(self, other, fillvalue=None):
777        '''add other Field not included in self and add other's values'''
778        sname = set(self.lname)
779        oname = set(other.lname)
780        newself = copy(self)
781        copother = copy(other)
782        for nam in oname - sname:
783            newself.addindex({nam: [fillvalue] * len(newself)})
784        for nam in sname - oname:
785            copother.addindex({nam: [fillvalue] * len(copother)})
786        return newself.add(copother, name=True, solve=False)

add the Fields of other that are not included in self, then add other's values

def merging(self, listname=None):
788    def merging(self, listname=None):
789        ''' add a new Field build with Field define in listname.
790        Values of the new Field are set of values in listname Field'''
791        #self.addindex(Field.merging([self.nindex(name) for name in listname]))
792        self.addindex(Sfield.merging([self.nindex(name) for name in listname]))

add a new Field built from the Fields defined in listname. Values of the new Field are sets of the values of the listname Fields

def orindex(self, other, first=False, merge=False, update=False):
794    def orindex(self, other, first=False, merge=False, update=False):
795        ''' Add other's index to self's index (with same length)
796
797        *Parameters*
798
799        - **other** : self class - object to add
800        - **first** : Boolean (default False) - If True insert indexes
801        at the first row, else at the end
802        - **merge** : Boolean (default False) - create a new index
803        if merge is False
804        - **update** : Boolean (default False) - if True, update actual
805        values if index name is present (and merge is True)
806
807        *Returns* : none '''
808        if len(self) != 0 and len(self) != len(other) and len(other) != 0:
809            raise DatasetError("the sizes are not equal")
810        otherc = copy(other)
811        for idx in otherc.lindex:
812            self.addindex(idx, first=first, merge=merge, update=update)
813        return self

Add other's indexes to self's indexes (with the same length)

Parameters

  • other : self class - object to add
  • first : Boolean (default False) - If True insert indexes at the first row, else at the end
  • merge : Boolean (default False) - create a new index if merge is False
  • update : Boolean (default False) - if True, update actual values if index name is present (and merge is True)

Returns : none

def record(self, row, indexname=None, extern=True):
815    def record(self, row, indexname=None, extern=True):
816        '''return the record at the row
817
818        *Parameters*
819
820        - **row** : int - row of the record
821        - **extern** : boolean (default True) - if True, return val record else
822        value record
823        - **indexname** : list of str (default None) - list of fields to return
824        *Returns*
825
826        - **list** : val record or value record'''
827        if indexname is None:
828            indexname = self.lname
829        if extern:
830            record = [idx.val[row] for idx in self.lindex]
831            #record = [idx.values[row].to_obj() for idx in self.lindex]
832            #record = [idx.valrow(row) for idx in self.lindex]
833        else:
834            record = [idx.values[row] for idx in self.lindex]
835        return [record[self.lname.index(name)] for name in indexname]

return the record at the row

Parameters

  • row : int - row of the record
  • extern : boolean (default True) - if True, return val record else value record
  • indexname : list of str (default None) - list of fields to return

Returns

  • list : val record or value record
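
A minimal sketch (values are illustrative): record returns the external (or internal) values of a row, optionally restricted to some Fields.

    from tab_dataset.dataset import Sdataset

    ds = Sdataset.ext([['jan', 'feb', 'mar'], [10, 12, 9]], ['month', 'temp'])
    print(ds.record(1))                      # ['feb', 12]
    print(ds.record(1, indexname=['temp']))  # [12]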

def recidx(self, row, extern=True):
837    def recidx(self, row, extern=True):
838        '''return the list of idx val or values at the row
839
840        *Parameters*
841
842        - **row** : int - row of the record
843        - **extern** : boolean (default True) - if True, return val rec else value rec
844
845        *Returns*
846
847        - **list** : val or value for idx'''
848        if extern:
849            return [idx.values[row].to_obj() for idx in self.lidx]
850            # return [idx.valrow(row) for idx in self.lidx]
851        return [idx.values[row] for idx in self.lidx]

return the list of idx val or values at the row

Parameters

  • row : int - row of the record
  • extern : boolean (default True) - if True, return val rec else value rec

Returns

  • list : val or value for idx
def recvar(self, row, extern=True):
853    def recvar(self, row, extern=True):
854        '''return the list of var val or values at the row
855
856        *Parameters*
857
858        - **row** : int - row of the record
859        - **extern** : boolean (default True) - if True, return val rec else value rec
860
861        *Returns*
862
863        - **list** : val or value for var'''
864        if extern:
865            return [idx.values[row].to_obj() for idx in self.lvar]
866            # return [idx.valrow(row) for idx in self.lvar]
867        return [idx.values[row] for idx in self.lvar]

return the list of var val or values at the row

Parameters

  • row : int - row of the record
  • extern : boolean (default True) - if True, return val rec else value rec

Returns

  • list : val or value for var
def setcanonorder(self, reindex=False):
869    def setcanonorder(self, reindex=False):
870        '''Set the canonical index order : primary - secondary/unique - variable.
871        Set the canonical keys order : ordered keys in the first columns.
872
873        *Parameters*
874        - **reindex** : boolean (default False) - if True, set default codec after
875        transformation
876
877        *Return* : self'''
878        order = self.primaryname
879        order += self.secondaryname
880        order += self.lvarname
881        order += self.lunicname
882        self.swapindex(order)
883        self.sort(reindex=reindex)
884        # self.analysis.actualize()
885        return self

Set the canonical index order : primary - secondary/unique - variable. Set the canonical keys order : ordered keys in the first columns.

Parameters

  • reindex : boolean (default False) - if True, set default codec after transformation

Return : self

def setfilter(self, filt=None, first=False, filtname='$filter', unique=False):
887    def setfilter(self, filt=None, first=False, filtname=FILTER, unique=False):
888        '''Add a filter index with boolean values
889
890        - **filt** : list of boolean - values of the filter idx to add
891        - **first** : boolean (default False) - If True insert index at the first row,
892        else at the end
893        - **filtname** : string (default FILTER) - Name of the filter Field added
894
895        *Returns* : self'''
896        if not filt:
897            filt = [True] * len(self)
898        idx = self.field(filt, name=filtname)
899        idx.reindex()
900        if not idx.cod in ([True, False], [False, True], [True], [False]):
901            raise DatasetError('filt is not consistent')
902        if unique:
903            for name in self.lname:
904                if name[:len(FILTER)] == FILTER:
905                    self.delindex(FILTER)
906        self.addindex(idx, first=first)
907        return self

Add a filter index with boolean values

  • filt : list of boolean - values of the filter idx to add
  • first : boolean (default False) - If True insert index at the first row, else at the end
  • filtname : string (default FILTER) - Name of the filter Field added

Returns : self
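
A minimal sketch (values are illustrative): setfilter adds a boolean '$filter' Field, and applyfilter then removes the filtered records together with the filter Field itself.

    from tab_dataset.dataset import Sdataset

    ds = Sdataset.ext([['jan', 'feb', 'mar'], [10, 12, 9]], ['month', 'temp'])
    ds.setfilter([True, False, True])   # keep rows 0 and 2
    ds.applyfilter()
    print(len(ds))    # 2 records
    print(ds.lname)   # ['month', 'temp'] : the filter Field has been removed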

def sort(self, order=None, reverse=False, func=<class 'str'>, reindex=True):
909    def sort(self, order=None, reverse=False, func=str, reindex=True):
910        '''Sort data following the index order and apply the ascending or descending
911        sort function to values.
912
913        *Parameters*
914
915        - **order** : list (default None)- new order of index to apply. If None or [],
916        the sort function is applied to the existing order of indexes.
917        - **reverse** : boolean (default False)- ascending if True, descending if False
918        - **func**    : function (default str) - parameter key used in the sorted function
919        - **reindex** : boolean (default True) - if True, apply a new codec order (key = func)
920
921        *Returns* : self'''
922        if not order:
923            order = list(range(self.lenindex))
924        orderfull = order + list(set(range(self.lenindex)) - set(order))
925        if reindex:
926            for i in order:
927                self.lindex[i].reindex(codec=sorted(
928                    self.lindex[i].codec, key=func))
929        newidx = Cutil.transpose(sorted(Cutil.transpose(
930            [self.lindex[orderfull[i]].keys for i in range(self.lenindex)]),
931            reverse=reverse))
932        for i in range(self.lenindex):
933            self.lindex[orderfull[i]].set_keys(newidx[i])
934        return self

Sort data following the index order and apply the ascending or descending sort function to values.

Parameters

  • order : list (default None) - new order of index to apply. If None or [], the sort function is applied to the existing order of indexes.
  • reverse : boolean (default False) - descending if True, ascending if False
  • func : function (default str) - parameter key used in the sorted function
  • reindex : boolean (default True) - if True, apply a new codec order (key = func)

Returns : self
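
A minimal sketch (values are illustrative): with the default func=str, codecs are reordered by their string representation and the records follow.

    from tab_dataset.dataset import Sdataset

    ds = Sdataset.ext([['mar', 'jan', 'feb'], [9, 10, 12]], ['month', 'temp'])
    ds.sort()
    print(ds.record(0))   # ['feb', 12] : 'feb' comes first in string order
    ds.sort(reverse=True)
    print(ds.record(0))   # ['mar', 9]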

def tostdcodec(self, inplace=False, full=True):
955    def tostdcodec(self, inplace=False, full=True):
956        '''Transform all codec in full or default codec.
957
958        *Parameters*
959
960        - **inplace** : boolean  (default False) - if True apply transformation
961        to self, else to a new Dataset
962        - **full** : boolean (default True)- full codec if True, default if False
963
964
965        *Return Dataset* : self or new Dataset'''
966        lindex = [idx.tostdcodec(inplace=False, full=full)
967                  for idx in self.lindex]
968        if inplace:
969            self.lindex = lindex
970            return self
971        return self.__class__(lindex, self.lvarname)

Transform all codec in full or default codec.

Parameters

  • inplace : boolean (default False) - if True apply transformation to self, else to a new Dataset
  • full : boolean (default True) - full codec if True, default codec if False

Return Dataset : self or new Dataset

def updateindex(self, listvalue, index, extern=True):
973    def updateindex(self, listvalue, index, extern=True):
974        '''update values of an index.
975
976        *Parameters*
977
978        - **listvalue** : list - index values to replace
979        - **index** : integer - index row to update
980        - **extern** : if True, the listvalue has external representation, else internal
981
982        *Returns* : none '''
983        self.lindex[index].setlistvalue(listvalue, extern=extern)

update values of an index.

Parameters

  • listvalue : list - index values to replace
  • index : integer - index row to update
  • extern : if True, the listvalue has external representation, else internal

Returns : none

def valtokey(self, rec, extern=True):
985    def valtokey(self, rec, extern=True):
986        '''convert a record list (value or val for each idx) to a key list
987        (key for each index).
988
989        *Parameters*
990
991        - **rec** : list of value or val for each index
992        - **extern** : if True, the rec value has external representation, else internal
993
994        *Returns*
995
996        - **list of int** : record key for each index'''
997        return [idx.valtokey(val, extern=extern) for idx, val in zip(self.lindex, rec)]

convert a record list (value or val for each idx) to a key list (key for each index).

Parameters

  • rec : list of value or val for each index
  • extern : if True, the rec value has external representation, else internal

Returns

  • list of int : record key for each index
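
A minimal sketch (values are illustrative): valtokey and keytoval are inverse conversions between external values and internal keys.

    from tab_dataset.dataset import Sdataset

    ds = Sdataset.ext([['jan', 'feb', 'mar'], [10, 12, 9]], ['month', 'temp'])
    keys = ds.valtokey(['feb', 12])
    print(keys)               # one key per Field, e.g. [1, 1]
    print(ds.keytoval(keys))  # ['feb', 12] : back to the external values
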
Inherited Members
tab_dataset.dataset_interface.DatasetInterface
json
plot
to_csv
to_dataframe
to_file
to_ntv
to_xarray
voxel
view
vlist
tab_dataset.cdataset.Cdataset
indexlen
iindex
keys
lenindex
lunicname
lunicrow
lname
tiindex
ntv
from_ntv
add
to_analysis
reindex
delindex
nindex
renameindex
reorder
setname
swapindex
check_relation
check_relationship
tab_dataset.cdataset.DatasetAnalysis
analysis
anafields
partitions
complete
dimension
lvarname
primaryname
secondaryname
indexinfos
field_partition
relation
tree
indicator
class Ndataset(Sdataset):
 999class Ndataset(Sdataset):
1000    # %% Ndataset
1001    '''    
1002    `Ndataset` is a child class of Cdataset where internal value are NTV entities.
1003    
1004    All the methods are the same as `Sdataset`.
1005    '''
1006    field_class = Nfield

Ndataset is a child class of Sdataset where internal values are NTV entities.

All the methods are the same as Sdataset.

Inherited Members
Sdataset
Sdataset
from_csv
from_file
merge
ext
consistent
extidx
extidxext
idxname
idxlen
iidx
lenidx
lidx
lisvar
lvar
lvarrow
lidxrow
primary
secondary
setidx
zip
addindex
append
applyfilter
coupling
delrecord
full
getduplicates
iscanonorder
isinrecord
idxrecord
keytoval
loc
mix
merging
orindex
record
recidx
recvar
setcanonorder
setfilter
sort
tostdcodec
updateindex
valtokey
tab_dataset.dataset_interface.DatasetInterface
json
plot
to_csv
to_dataframe
to_file
to_ntv
to_xarray
voxel
view
vlist
tab_dataset.cdataset.Cdataset
indexlen
iindex
keys
lenindex
lunicname
lunicrow
lname
tiindex
ntv
from_ntv
add
to_analysis
reindex
delindex
nindex
renameindex
reorder
setname
swapindex
check_relation
check_relationship
tab_dataset.cdataset.DatasetAnalysis
analysis
anafields
partitions
complete
dimension
lvarname
primaryname
secondaryname
indexinfos
field_partition
relation
tree
indicator