Math Is Fun Forum
  Discussion about math, puzzles, games and fun.   Useful symbols: ÷ × ½ √ ∞ ≠ ≤ ≥ ≈ ⇒ ± ∈ Δ θ ∴ ∑ ∫ • π ƒ -¹ ² ³ °

You are not logged in.

#1 Re: Coder's Corner » Cartesian Product...with a twist (need to remove duplicates) » 2011-08-06 11:30:27

pip

Thanks for the help bobbym.

You explain your solution well and I have a better understanding of this multiset topic now than I did a few days ago - understanding the maths that goes behind solving this real-world problem has been most helpful. I have managed to solve this issue using the LINQ to Dataset component of Microsoft's .NET framework.....I have posted my solution below in C# - I appreciate that this is a maths, not a programming, forum but as the two disciplines have huge areas of overlap this may come in handy for someone. If you have LinqPad, just paste the solution below in the window and run it.....it gives the right number of elements in the solution.

Thanks again bobby  big_smile

static void Main()
{
	var sequences = new List<List<int>> { new List<int> { 2, 3 }, new List<int> { 2, 3 }, new List<int> { 2, 3 }, new List<int> { 2, 3 }, new List<int> { 2, 3 } };

	var comparer = new UnorderedSequenceComparer();
	var distinctOutput =CartesianProduct(sequences, true, comparer);
	distinctOutput.Dump();
}

static IEnumerable<IEnumerable<T>> CartesianProduct<T>(IEnumerable<IEnumerable<T>> sequences, bool removeDuplicates, IEqualityComparer<IEnumerable<T>> sequenceComparer)
{
	IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>() };
	var resultsSet = sequences.Aggregate(emptyProduct, (accumulator, sequence) => from accseq in accumulator
																				  from item in sequence
																				  select accseq.Concat(new[] { item }));

	if (removeDuplicates)
		return resultsSet.Distinct(sequenceComparer);
	else
		return resultsSet;
}

class UnorderedSequenceComparer : IEqualityComparer<IEnumerable<int>>
{
	public bool Equals(IEnumerable<int> x, IEnumerable<int> y)
	{
		return x.OrderBy(i => i).SequenceEqual(y.OrderBy(i => i));
	}

	public int GetHashCode(IEnumerable<int> obj)
	{
		return obj.Sum(i => i * i);
	}
}

#2 Re: Coder's Corner » Cartesian Product...with a twist (need to remove duplicates) » 2011-08-05 07:34:25

pip

bobbym,

I been reading more into linq today and it really does seem that it can do any set operations.

I'm intrigued by the whole two-commands-to-do-the-whole-thing statement - ok, what u got?:)

Cheers,

pip

#3 Re: Coder's Corner » Cartesian Product...with a twist (need to remove duplicates) » 2011-08-04 18:51:19

pip

Yes - that's it - 32 possible results in the result-set but there will only be 6 completely unique results if we take ordering out of the mix, i.e

2,2,2,2,2
2,2,2,2,3
2,2,2,3,3
2,2,3,3,3
2,3,3,3,3
3,3,3,3,3

The language I am using is C# under .NET3.5 and so I have access to all the set commands of LINQ, i.e. SELECT, WHERE, INTERSECT, UNION.....pretty much any set operation that is needed would be supported. I have asked this same question on a dedicated LINQ forum and quite a few programmatical answers have been posted - but I tend to be the sort of person who just HAS to know what is going on under the hood, so if you know what set operations can be used to resolve this I would still love to know.

Cheers,

Phil

#4 Re: Coder's Corner » Cartesian Product...with a twist (need to remove duplicates) » 2011-08-04 07:20:24

pip
bobbym wrote:

Hi;

You have a partial example of 5 elements. Can you provide a full example? I mean all the sets and each steps output. From beginning to end.
If that is too large how about sets with 4 elements. What I need is one problem completely solved with nothing left out.

It has been my experience that explanations of a process using words only is inaccurate.

Hi bobbym,

The example I give is actually complete - I do apologise though, as I just re-read my original post and I should state that the maximum number of lists within the ListOfLists is 5.

As stated in the original post though the minimum size of each inner list is 1, with a maximum size of 2. That is, the worst-case scenario from a combination point of view would be {{2,3}, {2,3}, {2,3}, {2,3}, {2,3}}, i.e. a list containing inner lists of the maximum size, and in this case there would obviously be 32 results in the Cartesian Product result-set, but the pruned result-set that I am trying to get at would just be:-

2,2,2,2,2
2,2,2,2,3  <-- all other results with four 2's and one 3 (in any order) are suppressed
2,2,2,3,3  <-- all other results with three 2's and two 3's are suppressed, etc
2,2,3,3,3
2,3,3,3,3
3,3,3,3,3

What i can't know until run-time of the program, is how many inner lists there will be, and how many of them will contain this choice of a 2 or 3, and that's why I (think) I need to compute the Cartesian product, then prune duplicates.

I hope I have explained that OK - the program I am working on has a seemingly infinite number of combinatorial quirks but this one has me stumped....technically, I have resolved the issue but I am essentially having to take each result in the cartesian product result-set, order it (in ascending order) for comparison, then take the first result from the result-set and compare it to all the others (that involves checking for a complete intersection of the two), add the current result being compared into a new result-set, delete the duplicates from the original result-set, then iterate from the start again, .....it works, it's just computationally intensive. I've tried to quickly pseudo-code this below....

foreach(input-item in input-set)
    sort the input-item in ascending order
end loop

create a new set (output-set)

while(input-set is not empty)
    remove first input-item from input-set
    add it to output-set
    
    foreach(input-item in input-set)
        compare the removed-item with input-item
        if(the two items are equal)
            remove the matched item from input-set

As I say, I have a working solution, but that pain in the math part of my brain just "knows" there is a more efficient way of doing this. If there is no mathematical way you know of, then no problem, I just wanted to satisfy my own curiosity, and a brush-up on my set-theory knowledge is no bad thing. smile

#5 Re: Coder's Corner » Cartesian Product...with a twist (need to remove duplicates) » 2011-08-04 05:41:28

pip
bobbym wrote:

Hi pip;

Welcome to the forum!

If you were asking for the number of something then that is a math problem but it seems to me that you are asking for how to prune the list further. This is a programming problem don't you think.

Cheers for the quick reply,


I see your point, and I have in fact managed to do this pruning programatically but my current solution is really a brute-force approach with many iterations and I was hoping there would be some set-operation that I am unaware of that could help do this more elegantly...us programmers are all about finding the elegant solution whenever we can. smile

#6 Coder's Corner » Cartesian Product...with a twist (need to remove duplicates) » 2011-08-04 04:24:16

pip
Replies: 12

Hi folks,

I hope someone is able to help me with what is, at least to me, quite a tricky algorithm.

===========
The Problem
===========

I have a List (of unknown size) of Lists (1<=size<=2) that I need to permute (or possibly combine.....forgive me, I am a programmer, not a mathematician). Here is an example of what I am looking at:-

ListOfLists = { {1}, {2,3}, {2,3}, {4}, {2,3} }

To give this some context, the ListOfLists is actually a list of hotel rooms. Each hotel room has a particular Occupancy (number of adults and children) and this Occupancy maps to a particular OccupancyID (which is what is contained inside each inner list. The problem here is that a room with 2 adults maps to 2 different occupancyID's (2=twin room, 3=double room). So, what I am saying in the example ListOfLists above is that I am looking for 5 rooms, 3 of which have 2 possible occupancyID's (2 or 3), the other 2 have clearly-defined occupancyID...

So, there are 2 stages to what I need to do:-

(1). I need to combine the inner lists in such a way that any combination has exactly ONE item from each list, that is, the possible combinations in the result set here would be:-

1,2,2,4,2
1,2,2,4,3
1,2,3,4,2
1,2,3,4,3
1,3,2,4,2
1,3,2,4,3
1,3,3,4,2
1,3,3,4,3

I have actually managed to find a solution to this stage using some of the excellent tutorials and blogs around the net (see below for links) - it is called the Cartesian Product as the title of this question might suggest.....now, here comes the twist which I can't figure out.

(2). I then need to filter out any duplicate results from this Cartesian Product. A duplicate in this case constitutes any line in the result set with the same quantity of each distinct list element as another line, that is,

1,2,2,4,3 is the "same" as 1,3,2,4,2

because each item within the first list occurs the same number of times in the second list (1 occurs once in each list, 2 appears twice in each list, ....

The final result set should therefore look like this...

1,2,2,4,2
1,2,2,4,3
    --                                 1,2,2,4,2
1,2,3,4,3      OR, simply     1,2,2,4,3 
    --                                 1,2,3,4,3
    --                                 1,3,3,4,3
    --
1,3,3,4,3

To any mathematically-minded folks out there, I do appreciate that what I am asking for might actually be achieved by a straight-forward combination or permutation, but I have not been able to find an answer to this particular situation. I hope someone is able to help.


Some resources used
blogs.msdn.com/b/ericlippert/archive/2010/06/28/computing-a-cartesian-product-with-linq.aspx
stackoverflow.com/questions/710670/c-permutation-of-an-array-of-arraylists
msdn.microsoft.com/en-us/vcsharp/aa336800.aspx

Board footer

Powered by FluxBB