How to identify UTF-8 one-byte spaces (C2A0)

KubotaNatsuki · July 2024

Hi!
I would like to use the TRIM function to get up to the first 60 characters of a text.
However, if the 60th character is a one-byte space (C2A0 in UTF-8), the TRIM function does not recognize it, so the space is still displayed as the 60th character.
Is there any way to avoid this?
If anyone knows, please help me.
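
A note on the byte sequence, as a minimal Python sketch rather than anything Anaplan-specific: C2 A0 is the two-byte UTF-8 encoding of U+00A0 (NO-BREAK SPACE), which is a different character from the ordinary space U+0020. The sketch assumes a trim that removes only U+0020, which matches the behaviour described above; the variable names are illustrative.

```python
# U+00A0 NO-BREAK SPACE is encoded in UTF-8 as the two bytes C2 A0.
NBSP = "\u00a0"
SPACE = " "  # ordinary space, U+0020, encoded as the single byte 20

nbsp_bytes = NBSP.encode("utf-8")
space_bytes = SPACE.encode("utf-8")

text = "abc" + NBSP

# Assumption: the trim only knows about U+0020, so the trailing
# no-break space is left untouched.
trimmed_plain = text.strip(SPACE)

# Naming the no-break space explicitly is what removes it.
trimmed_both = text.rstrip(SPACE + NBSP)

print(nbsp_bytes.hex(), space_bytes.hex())
print(len(trimmed_plain), len(trimmed_both))
```

The two characters look identical on screen, which is why the leftover space is hard to spot by eye.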

PulkitChawla · July 2024

@KubotaNatsuki - did you try the LEFT function, with a formula something like LEFT(Text, 60)? Can you try this and see if it gives you the expected result?

KubotaNatsuki · July 2024

Thanks for the reply!
With that, a one-byte space is still displayed as the 60th character.
If the 60th character is a one-byte space, I would like to delete it and end up with 59 characters, but it is difficult to identify whether the 60th character is a one-byte space.
Is there any solution to this problem?

andrewtye · July 2024

Could you use an IF statement that checks whether the last character is the "space", and if so takes LEFT 59, otherwise does the TRIM?
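
The logic of that suggestion can be sketched in Python, only to show the branching and not as an Anaplan formula. The function name `first_chars`, the `limit` parameter, and the assumption that the "space" is U+00A0 (the C2A0 character from the question) are all illustrative.

```python
NBSP = "\u00a0"  # the C2A0 character from the question (assumed)


def first_chars(text: str, limit: int = 60) -> str:
    """Take up to `limit` characters; drop a trailing no-break space."""
    head = text[:limit]  # plays the role of LEFT(Text, 60)
    if head.endswith(NBSP):
        # Last kept character is the no-break space: keep one fewer.
        return head[:-1]
    # Otherwise trim ordinary spaces only, as a plain trim would.
    return head.strip(" ")


sample = "a" * 59 + NBSP + "bbbbb"
print(len(first_chars(sample)))
```

The catch raised later in the thread is that the comparison itself needs a way to write the C2A0 character in the formula.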

Amaya · July 2024

@KubotaNatsuki

This was a very interesting question, and I learned a lot from it. Thank you for asking.
I had never encountered this case before, but I was able to reproduce it. TRIM indeed does not remove this space.

I also found that, when trying to handle it with an IF statement, the C2A0 character cannot be written in a formula: even if you paste it in, Anaplan reads it as a normal space.

If that is the case, then if we can somehow get a line item that contains only the C2A0 character, we can use it for replacements and so on.
So I tried the following.
First, create a dummy string with the C2A0 character sandwiched between "X" characters. Then use SUBSTITUTE to replace the "X" in that dummy string with "", which gives a line item containing only the C2A0 character.

After cutting the text down to 60 characters (B/C above), you can use SUBSTITUTE again to replace the C2A0 character with "", which brings it down to 59 characters. What do you think of this approach?

It is probably a technique that few people will ever need, but I enjoyed working it out.
I hope this is helpful.

Taichi
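
The sandwich technique described in this post can be mirrored in Python as a rough sketch. It is not Anaplan code: the variable names and sample text are made up, `str.replace` stands in for SUBSTITUTE and slicing for LEFT, and the C2A0 character is assumed to be U+00A0.

```python
NBSP = "\u00a0"  # assumed to be the C2A0 character

# Step 1: a dummy string with the character sandwiched between "X".
dummy = "X" + NBSP + "X"

# Step 2: remove the "X" characters, leaving only the no-break space.
nbsp_only = dummy.replace("X", "")

# Step 3: cut the text to 60 characters, then remove the no-break space.
text = "a" * 59 + NBSP + "rest of the text"
head = text[:60]
cleaned = head.replace(nbsp_only, "")

print(len(head), len(cleaned))
```

One point to keep in mind: a replacement like this removes every no-break space within the first 60 characters, not only one in the 60th position.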
