Removing duplicates from a list


#1

It seems this should be simple enough, but it isn’t. I have a list from which I want to extract the unique values without reordering. I am attempting to do this with the Element command. I am stepping through the source list, and if a value does not exist in the output list I wish to append it.

Assume c is a list with duplicates and c3 is the output list without duplicates.

The first example uses a simple If statement. When evaluated c3 contains all of c including duplicates.

c3=[]
for x in c
  if !(c3 @? x)
    c3 @= x
  end
end
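
For readers comparing against another language, the logic this loop is meant to implement can be sketched in Python. This is only an illustration of the intended behavior, not MathStudio code: the function name unique_in_order and the sample list are made up here, with `not in` standing in for !(c3 @? x) and append standing in for c3 @= x.

```python
def unique_in_order(values):
    """Return the distinct items of `values`, keeping first-occurrence order."""
    result = []
    for x in values:
        if x not in result:    # stands in for !(c3 @? x)
            result.append(x)   # stands in for c3 @= x
    return result


c = [12, 4, 12, 24, 4, 28, 8]  # made-up sample data with duplicates
c3 = unique_in_order(c)
print(c3)  # [12, 4, 24, 28, 8]
```

If the MathStudio loop behaves correctly, c3 should match this kind of result: every value once, in the order it first appears in c.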

The second form, using the Conditional operator, crashes MathStudio.

c3=[]
For x in c
  !(c3 @? x) ? c3 @= x : 0
end

How can I make this work?

MathStudio 7.3.2 under MacOS Mojave


#2

Curious, the first algorithm works fine for me, iMac latest Mojave.

Can you give an example of the case it fails on?


#3

This is a fail example.
For the first test I sort the list and look for the change of value.

For the second test I use:

if !(c3 @? x)
c3 @= x

The list returns unsorted.

In the third case I use !(c4 @? x) ? c4 @= x : 0 to search for the duplicates. This one crashes MathStudio.

Polygon areas simple case.math (608 Bytes)


#4

I tried your example on the web version with similar results. I then inserted a trace() command just to see how c3 and c4 were being formed.

I got an unexpected result on the values from trace for variable c3.

I was unable to trace c4 because the program crashes. I am attaching a copy of what I did and how I used the Numerical command to clean up the c3 entries.

Test Trace


#5

It appears that the list c is being passed to the comparison unevaluated in the second case but not in the first. That means that LCM(n,m) is not evaluated in the initial list. Sort(c) must evaluate c before sorting, or else it couldn’t sort. That’s why that comparison works.

The other two cases operate on a list of LCM() and the comparison fails. The final case probably crashes MathStudio due to a type mismatch in the Element statement.

I tested several variations including evaluating the comparison before the If() statement. Even when the comparison yields 0 the x value seems to be appended to the output list. There is an issue with If() or the append operator @= or both.

After a bit of consideration:

If I interpret the behavior of @= correctly, then instead of appending the value at the current location of the list pointer in For x in c, it replaces c3 with c(1…current position of pointer).

Nope, I’m wrong. The script fails only for some instances of the comparison.

The first failure is the second occurrence of 12, which @? fails to find in c3, so the script appends it to the list. After that it reaches the 4, which is not in c3, yet the test evaluates to 1.

A similar event occurs after it evaluates 28, when it then fails to append the 8.

Holy cow, is it failing on 4 and 8 because 24 and 28 are in c3?

Nope, it also fails on 5 and 10.

There is something very screwy happening here.

remove duplicates.If only.math (884 Bytes)


#6

Indeed! I created a couple of lists and tested each element in the list to see whether it was in the list. If it was, the positive value was returned. If it was not, the negative. As you can see from the attached, the element operator is, mathematically speaking, wonky. Some lists are OK, some are not.

I can write my own “is an element of” fcn and all works properly. We need to report this as a bug. BTW, the results are the same on my iPad Pro with the latest iOS. So it is not unique to the Mac, but likely a glitch in the engine somewhere.

removeDuplicates.math (2.2 KB)
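
The attached file is not reproduced in this thread, so the following is only a guess at the general shape of such a workaround, written in Python rather than MathStudio. The name is_element is hypothetical; the idea is to walk the list and compare each entry explicitly instead of relying on a built-in membership operator.

```python
def is_element(items, target):
    """Hand-written membership test: True if `target` equals some entry."""
    for item in items:
        if item == target:  # explicit element-by-element comparison
            return True
    return False


# Made-up values echoing the cases discussed above: 4 is not in the list,
# even though 24 is.
print(is_element([12, 24, 28], 4))   # False
print(is_element([12, 24, 28], 24))  # True
```

A test like this can replace the built-in operator inside the duplicate-removal loop while the bug is outstanding.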


#7

I also ran it on an iPad Pro with the same results.

I spent some time trying to find the point of failure and wasn’t successful.

As I see it there are two issues. The elementOf (Contains) bug and the unevaluated function in the original list. I may beat on it some more or spend the evening binging out on the second season of Patriot on Amazon.

BTW, I intend to plagiarize your function. :sunglasses:

Here is a little script you might find useful. It finds the position of an element in a list.
position.math (158 Bytes)
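
Since the attachment's contents are not shown here, this is a rough Python sketch of what a position lookup can look like, not the script itself. The function name position, the 1-based indexing, and the choice to return 0 when the value is absent are all assumptions made for illustration.

```python
def position(items, target):
    """Return the 1-based index of the first match, or 0 if there is none."""
    for index, item in enumerate(items, start=1):
        if item == target:
            return index
    return 0  # assumed convention for "not found"


print(position([12, 4, 24, 28], 24))  # 3
print(position([12, 4, 24, 28], 5))   # 0
```

A nonzero result doubles as a membership test, so a function like this can also stand in for the element operator.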


#8

No problem. MS bugs can be infuriating at times, but the neat thing about MS is that you can always write your own fcn as a workaround. So for the most part they are simply annoyances.

As to the unevaluated function in the original list, I cannot say for certain if this is the case here, but one thing you need to keep in mind is that MS is inherently a SYMBOLIC algebra engine. It will evaluate expressions symbolically and only produce numerical results when the time comes and if the answer is to be displayed numerically. This allows MS to evaluate expressions “exactly” in many cases. For example, Sin(pi/4) is really represented symbolically as sqrt(2)/2 and not as a floating point number. If MS has no good way to evaluate an expression symbolically, it will resort to numerical methods. You can sort of see this in how MS displays results, either symbolically or numerically (or hybrid, which really means symbolically with numerical evaluation).

Anyway, one can control this symbolic vs numerical mode using the explicit fcn Number() for particular results, or the Numerical command, which applies to the entire entry. The latter is useful for multiline entries.

I once wrote a PSD Lomb function that was puzzlingly slow. It was fast for a small set of data, but instead of a semi-linear execution time wrt data set size, it was oddly exponential. It was also hitting memory limits for modest data sets. After a bit of digging (using Trace), I discovered it was doing everything symbolically, which was pointless for such a fcn. I found I could radically speed up the routine with the Numerical command in the fcn. (There is a thread somewhere on an FFT someone was doing where this was discussed.)

Usually it does not matter and one can let MS do its thing as it sees fit, but in some cases the distinction is important! So be advised.

Anyway here is a file of examples. Unfortunately using Numerical with your function does not fix the issue, so I suspect this is just a weird bug in the engine.

Example.math (1.3 KB)


#9

I regularly wrote scripts for MS when it was SpaceTime, mostly for matrix manipulation, but the engine became unreliable about the time of the name change and I defaulted to Mathematica. I’ve begun using MS again because of the portability between MacOS and iOS.