Extracting text field from transactional data load module

I have a question regarding pulling a text-formatted field (e.g., a status entry that could be but currently is not list-formatted) from a large transactional data load module that consists of CSV-imported line items and then various data mappings to lists, numbers, etc. and some RANK and MAX cleverness to focus in on the data rows most important to me.  The codes for these data rows are essentially composite keys that I can't fully reproduce from my target module. 

 

Source (Transactional) Module: (Dimensions: ROWID x line items)

 

{ROWID}  [Imported data fields...] [Project ID] [ProjectStage] [Project Status] [Data Mappings] ["RANK-Filtered" ID]

ABC001               ...                        123 - A           2A                     Happy                   ...                  BLANK

ABC002               ...                        234 - B           1                       Happy                   ...                  234-B

...                         ...                         ...                   ...                      ...                          ...                   ...

ABC456              ...                         123 - A           2A                     Happy                  ...                   123-A

...                         ...                         ...                   ...                      ...                          ...                   ...

ABC999               ...                        123 - A           3                       Sad                      ...                   123-A

 

 

My real goal is to pull the Status values into the target module based on the Project ID module dimension and Project Stage line item (list-formatted from another minimally-dimensioned data module whose row ID happens to be the CODE value for the Project ID list entry - pulling the Project Stage was relatively straightforward as a result).

Target Reporting Module: (Dimensions (Project ID x Other_List x Line Items)

 

Other_List_1 v

{Project ID}   [Other Calculated Values] [ProjectStage] [Project Status]  <--- Trying to populate this line item

123-A                           ...                             2A                    (Happy)

123-A                           ...                             3                      (Sad)                                   

...                                 ...                             ...                      ... 

234-B                          ...                             1                       (Happy)

 

 

Combinations of LOOKUP and SELECT aren't getting it done, and other data pulled out of the source module makes heavy use of SUM, but I need text data.

Formulae tried:                                                                              Outcome:

 

'Source Module'.Project Status                                                     No error, but no data

'Source Module'.Project Status[LOOKUP:ProjectStage]              Thinks, then "Format of mapping used for lookup doesn't match any dimension of the source"

'Source Module'.Project Status[SELECT:ProjectStage]                Invalid: "The name Gate used in the SELECT clause is not recognized as a list member or version in the model"

 

Thoughts?

 

Thanks in advance,

 

Best Answer

  • wquan
    Answer ✓

    Hi There,

     

    You can try using the firstnonblank or lastnonblank functions... but if you have the same project ID and Project stage it'll either pick the first or last down the ROWID hierarchy.

     

    Example is: 'Source Module'.Project Status[FIRSTNONBLANK:ProjectStage, FIRSTNONBLANK: Project ID]    

     

    Make sure that the Project ID is a line item in the module with a format of Project ID and a formula of ITEM('Project ID')

Answers

  • Thank you so much for this suggestion!  LASTNONBLANK is taking care of me nicely once I got my ducks in a row on the naming and such. 

     

    Something that isn't made clear in the Anapedia documentation is the requirement for the names of the matching criterion column having to match, not merely have identical list formatting.

     

    Also, I assume that using Sort on the module data entries has no impact on the output of FIRSTNONBLANK/LASTNONBLANK because it's only governed by the first/last occurrence as viewed with the default "un-Sort-ed" row order, and not, for instance, by the sorted order of the data in the last View shown when the formula was entered.

     

    Thank you again for your help!

  • Yep! From my usage of it, it's based on the order of the member in the list.

     

    Good luck with it all.

  • This approach also just helped me solve a formula error with lookup! Anyone know why firstnonblank works in this scenario and why lookup was not able to get the job done? 

     

    For my scenario I had 2 important columns

     

    Dimensions (ROW ID x Line Items (Transaction ID))

     

    ROW ID  TRANSACTION ID

    1              A

    2              B

    3              C

    4              D

     

    AND THEN TARGET MODULE

     

    Dimensions Transaction ID x Line Items (ROW ID)

     

    TRANSACTION ID         ROW ID

    A                                        ?

    B                                        ?

    C                                        ?

    D                                        ?

     

    The formula that ended up working for me was: SourceModule.ROWID[FIRSTNONBLANK: SourceModule.TRANSACTIONID]

  • Tagging @wquan @JaredDolich if you guys have further insight?

  • @DaanishSoomar 

    My best guess would be that LOOKUP requires a list formatted line item in your TARGET module or from a SYSTEM module that uses the Dimension from the Source. FIRSTNONBLANK is like a SUM function where the list formatted line item is in the SOURCE or SYSTEM module that uses the Dimension of the TARGET. Good use case to test because it happens a lot.

  • @DaanishSoomar 

     

    @JaredDolich is correct in that the source has to have a list formatted line item that is shared in the target, in this case transactional ID.  The other question, from an architectural perspective, is why are you using ROW ID instead of a unique code which has meaning?

     

    Rob

  • Thanks @JaredDolich @rob_marshall
    And yes you are asking the right question. The goal is eventually to use a unique ID that has meaning. Working in an outdated model, helping the model rebuild according to best practices.

    This is a temporary solution until we get our ETL tool ready to go which will then help make row ID become obsolete!