Math Is Fun Forum
  Discussion about math, puzzles, games and fun.   Useful symbols: ÷ × ½ √ ∞ ≠ ≤ ≥ ≈ ⇒ ± ∈ Δ θ ∴ ∑ ∫ • π ƒ -¹ ² ³ °

You are not logged in.

#1 2011-08-04 04:24:16

pip
Member
Registered: 2011-08-04
Posts: 6

Cartesian Product...with a twist (need to remove duplicates)

Hi folks,

I hope someone is able to help me with what is, at least to me, quite a tricky algorithm.

===========
The Problem
===========

I have a List (of unknown size) of Lists (1<=size<=2) that I need to permute (or possibly combine.....forgive me, I am a programmer, not a mathematician). Here is an example of what I am looking at:-

ListOfLists = { {1}, {2,3}, {2,3}, {4}, {2,3} }

To give this some context, the ListOfLists is actually a list of hotel rooms. Each hotel room has a particular Occupancy (number of adults and children) and this Occupancy maps to a particular OccupancyID (which is what is contained inside each inner list. The problem here is that a room with 2 adults maps to 2 different occupancyID's (2=twin room, 3=double room). So, what I am saying in the example ListOfLists above is that I am looking for 5 rooms, 3 of which have 2 possible occupancyID's (2 or 3), the other 2 have clearly-defined occupancyID...

So, there are 2 stages to what I need to do:-

(1). I need to combine the inner lists in such a way that any combination has exactly ONE item from each list, that is, the possible combinations in the result set here would be:-

1,2,2,4,2
1,2,2,4,3
1,2,3,4,2
1,2,3,4,3
1,3,2,4,2
1,3,2,4,3
1,3,3,4,2
1,3,3,4,3

I have actually managed to find a solution to this stage using some of the excellent tutorials and blogs around the net (see below for links) - it is called the Cartesian Product as the title of this question might suggest.....now, here comes the twist which I can't figure out.

(2). I then need to filter out any duplicate results from this Cartesian Product. A duplicate in this case constitutes any line in the result set with the same quantity of each distinct list element as another line, that is,

1,2,2,4,3 is the "same" as 1,3,2,4,2

because each item within the first list occurs the same number of times in the second list (1 occurs once in each list, 2 appears twice in each list, ....

The final result set should therefore look like this...

1,2,2,4,2
1,2,2,4,3
    --                                 1,2,2,4,2
1,2,3,4,3      OR, simply     1,2,2,4,3 
    --                                 1,2,3,4,3
    --                                 1,3,3,4,3
    --
1,3,3,4,3

To any mathematically-minded folks out there, I do appreciate that what I am asking for might actually be achieved by a straight-forward combination or permutation, but I have not been able to find an answer to this particular situation. I hope someone is able to help.


Some resources used
blogs.msdn.com/b/ericlippert/archive/2010/06/28/computing-a-cartesian-product-with-linq.aspx
stackoverflow.com/questions/710670/c-permutation-of-an-array-of-arraylists
msdn.microsoft.com/en-us/vcsharp/aa336800.aspx

Offline

#2 2011-08-04 05:21:04

bobbym
Administrator
From: Bumpkinland
Registered: 2009-04-12
Posts: 87,238

Re: Cartesian Product...with a twist (need to remove duplicates)

Hi pip;

Welcome to the forum!

If you were asking for the number of something then that is a math problem but it seems to me that you are asking for how to prune the list further. This is a programming problem don't you think.


In mathematics, you don't understand things. You just get used to them.
Of course that result can be rigorously obtained, but who cares?
Combinatorics is Algebra and Algebra is Combinatorics.

Offline

#3 2011-08-04 05:41:28

pip
Member
Registered: 2011-08-04
Posts: 6

Re: Cartesian Product...with a twist (need to remove duplicates)

bobbym wrote:

Hi pip;

Welcome to the forum!

If you were asking for the number of something then that is a math problem but it seems to me that you are asking for how to prune the list further. This is a programming problem don't you think.

Cheers for the quick reply,


I see your point, and I have in fact managed to do this pruning programatically but my current solution is really a brute-force approach with many iterations and I was hoping there would be some set-operation that I am unaware of that could help do this more elegantly...us programmers are all about finding the elegant solution whenever we can. smile

Offline

#4 2011-08-04 05:47:31

bobbym
Administrator
From: Bumpkinland
Registered: 2009-04-12
Posts: 87,238

Re: Cartesian Product...with a twist (need to remove duplicates)

Hi;

You have a partial example of 5 elements. Can you provide a full example? I mean all the sets and each steps output. From beginning to end.
If that is too large how about sets with 4 elements. What I need is one problem completely solved with nothing left out.

It has been my experience that explanations of a process using words only is inaccurate.


In mathematics, you don't understand things. You just get used to them.
Of course that result can be rigorously obtained, but who cares?
Combinatorics is Algebra and Algebra is Combinatorics.

Offline

#5 2011-08-04 07:20:24

pip
Member
Registered: 2011-08-04
Posts: 6

Re: Cartesian Product...with a twist (need to remove duplicates)

bobbym wrote:

Hi;

You have a partial example of 5 elements. Can you provide a full example? I mean all the sets and each steps output. From beginning to end.
If that is too large how about sets with 4 elements. What I need is one problem completely solved with nothing left out.

It has been my experience that explanations of a process using words only is inaccurate.

Hi bobbym,

The example I give is actually complete - I do apologise though, as I just re-read my original post and I should state that the maximum number of lists within the ListOfLists is 5.

As stated in the original post though the minimum size of each inner list is 1, with a maximum size of 2. That is, the worst-case scenario from a combination point of view would be {{2,3}, {2,3}, {2,3}, {2,3}, {2,3}}, i.e. a list containing inner lists of the maximum size, and in this case there would obviously be 32 results in the Cartesian Product result-set, but the pruned result-set that I am trying to get at would just be:-

2,2,2,2,2
2,2,2,2,3  <-- all other results with four 2's and one 3 (in any order) are suppressed
2,2,2,3,3  <-- all other results with three 2's and two 3's are suppressed, etc
2,2,3,3,3
2,3,3,3,3
3,3,3,3,3

What i can't know until run-time of the program, is how many inner lists there will be, and how many of them will contain this choice of a 2 or 3, and that's why I (think) I need to compute the Cartesian product, then prune duplicates.

I hope I have explained that OK - the program I am working on has a seemingly infinite number of combinatorial quirks but this one has me stumped....technically, I have resolved the issue but I am essentially having to take each result in the cartesian product result-set, order it (in ascending order) for comparison, then take the first result from the result-set and compare it to all the others (that involves checking for a complete intersection of the two), add the current result being compared into a new result-set, delete the duplicates from the original result-set, then iterate from the start again, .....it works, it's just computationally intensive. I've tried to quickly pseudo-code this below....

foreach(input-item in input-set)
    sort the input-item in ascending order
end loop

create a new set (output-set)

while(input-set is not empty)
    remove first input-item from input-set
    add it to output-set
    
    foreach(input-item in input-set)
        compare the removed-item with input-item
        if(the two items are equal)
            remove the matched item from input-set

As I say, I have a working solution, but that pain in the math part of my brain just "knows" there is a more efficient way of doing this. If there is no mathematical way you know of, then no problem, I just wanted to satisfy my own curiosity, and a brush-up on my set-theory knowledge is no bad thing. smile

Last edited by pip (2011-08-04 07:20:58)

Offline

#6 2011-08-04 07:31:25

bobbym
Administrator
From: Bumpkinland
Registered: 2009-04-12
Posts: 87,238

Re: Cartesian Product...with a twist (need to remove duplicates)

Okay, I will think on it. But it may be that the computationally expensive way is the only way, so hold on to what you have!

This what I am getting for you cartesian product for the example you cited.

What set commands does your language support?


In mathematics, you don't understand things. You just get used to them.
Of course that result can be rigorously obtained, but who cares?
Combinatorics is Algebra and Algebra is Combinatorics.

Offline

#7 2011-08-04 18:51:19

pip
Member
Registered: 2011-08-04
Posts: 6

Re: Cartesian Product...with a twist (need to remove duplicates)

Yes - that's it - 32 possible results in the result-set but there will only be 6 completely unique results if we take ordering out of the mix, i.e

2,2,2,2,2
2,2,2,2,3
2,2,2,3,3
2,2,3,3,3
2,3,3,3,3
3,3,3,3,3

The language I am using is C# under .NET3.5 and so I have access to all the set commands of LINQ, i.e. SELECT, WHERE, INTERSECT, UNION.....pretty much any set operation that is needed would be supported. I have asked this same question on a dedicated LINQ forum and quite a few programmatical answers have been posted - but I tend to be the sort of person who just HAS to know what is going on under the hood, so if you know what set operations can be used to resolve this I would still love to know.

Cheers,

Phil

Offline

#8 2011-08-04 19:46:16

bobbym
Administrator
From: Bumpkinland
Registered: 2009-04-12
Posts: 87,238

Re: Cartesian Product...with a twist (need to remove duplicates)

Hi;

I can get the result you want in two commands but C# may not support them.


In mathematics, you don't understand things. You just get used to them.
Of course that result can be rigorously obtained, but who cares?
Combinatorics is Algebra and Algebra is Combinatorics.

Offline

#9 2011-08-05 07:34:25

pip
Member
Registered: 2011-08-04
Posts: 6

Re: Cartesian Product...with a twist (need to remove duplicates)

bobbym,

I been reading more into linq today and it really does seem that it can do any set operations.

I'm intrigued by the whole two-commands-to-do-the-whole-thing statement - ok, what u got?:)

Cheers,

pip

Offline

#10 2011-08-05 09:15:12

René Descartes
Guest

Re: Cartesian Product...with a twist (need to remove duplicates)

Equi-join

#11 2011-08-05 09:24:56

bobbym
Administrator
From: Bumpkinland
Registered: 2009-04-12
Posts: 87,238

Re: Cartesian Product...with a twist (need to remove duplicates)

Hi René Descartes;

Welcome to the forum! Please do not do that. Terse answers like that cause much confusion. If you have an idea that can help pip please take the time to explain it in detail.


Hi pip;

Yes there is a two step method that uses a set command but it may be of very little use to you. Lisp is unusually rich in list manipulating commands. Mathematica which is a functional language like lisp is what I use now. Procedural languages may not be able to easily duplicate this idea.

The list of lists that we are working on is:

Look at element 2 and element 3

{2,2,2,2,3} and {2,2,2,3,2}

They are duplicates in structure of each but they are permutations of each other. They are not exactly the same. Notice the fact that the 2 is shifted over.

Step 1) Sort each element of s, this can be expensive or not depending on your language. In mathematica it is not. Also, if I were you I would try to have each internal list ( { 2,2,2,3,2} )( element ) sorted on construction. Anyway, mathematica does it like this:

Sort[#]& /@ s

Notice the use of the lambda calculus ( pure function )

The output is:

Call the above s1.

Now notice all the duplicates are exact duplicates. We can now use the fact that each list of 5 is an element of s1. The set theoretic command Union is great for eliminating duplicates.

We have the multiset { 1,2,2,3,3,4 } which we union with itself, the operation has removed all the duplicates. In mathematica

Union[s1] or Union[s1, s1]

This will yield:

This is what you desire. In short what I have done is this.

1) Sort each list of lists, these elements can each be thought of as 1 x 5 vectors. The 5 elements in each of these must be sorted when they are created or after. Call this list of lists s1.

2) Union s1 with itself, this removes duplicates. At least in functional languages. What C# will do is unknown.

You would have to experiment to see if your compiler supports commands that produce the desired effect.


In mathematics, you don't understand things. You just get used to them.
Of course that result can be rigorously obtained, but who cares?
Combinatorics is Algebra and Algebra is Combinatorics.

Offline

#12 2011-08-06 11:30:27

pip
Member
Registered: 2011-08-04
Posts: 6

Re: Cartesian Product...with a twist (need to remove duplicates)

Thanks for the help bobbym.

You explain your solution well and I have a better understanding of this multiset topic now than I did a few days ago - understanding the maths that goes behind solving this real-world problem has been most helpful. I have managed to solve this issue using the LINQ to Dataset component of Microsoft's .NET framework.....I have posted my solution below in C# - I appreciate that this is a maths, not a programming, forum but as the two disciplines have huge areas of overlap this may come in handy for someone. If you have LinqPad, just paste the solution below in the window and run it.....it gives the right number of elements in the solution.

Thanks again bobby  big_smile

static void Main()
{
	var sequences = new List<List<int>> { new List<int> { 2, 3 }, new List<int> { 2, 3 }, new List<int> { 2, 3 }, new List<int> { 2, 3 }, new List<int> { 2, 3 } };

	var comparer = new UnorderedSequenceComparer();
	var distinctOutput =CartesianProduct(sequences, true, comparer);
	distinctOutput.Dump();
}

static IEnumerable<IEnumerable<T>> CartesianProduct<T>(IEnumerable<IEnumerable<T>> sequences, bool removeDuplicates, IEqualityComparer<IEnumerable<T>> sequenceComparer)
{
	IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>() };
	var resultsSet = sequences.Aggregate(emptyProduct, (accumulator, sequence) => from accseq in accumulator
																				  from item in sequence
																				  select accseq.Concat(new[] { item }));

	if (removeDuplicates)
		return resultsSet.Distinct(sequenceComparer);
	else
		return resultsSet;
}

class UnorderedSequenceComparer : IEqualityComparer<IEnumerable<int>>
{
	public bool Equals(IEnumerable<int> x, IEnumerable<int> y)
	{
		return x.OrderBy(i => i).SequenceEqual(y.OrderBy(i => i));
	}

	public int GetHashCode(IEnumerable<int> obj)
	{
		return obj.Sum(i => i * i);
	}
}

Last edited by pip (2011-08-06 11:32:30)

Offline

#13 2011-08-06 12:39:18

bobbym
Administrator
From: Bumpkinland
Registered: 2009-04-12
Posts: 87,238

Re: Cartesian Product...with a twist (need to remove duplicates)

H pip;

Glad you found a solution! Good luck with the project.


In mathematics, you don't understand things. You just get used to them.
Of course that result can be rigorously obtained, but who cares?
Combinatorics is Algebra and Algebra is Combinatorics.

Offline

Board footer

Powered by FluxBB