Python itertools permutations narrowing down results by indices comparison, not working as expected

https://stackoverflow.com/questions/20592526

01-09-2022

Question

Somewhat python newb here trying to figure out why my code is not giving the expected result. First the code:

from itertools import permutations

word_list = ['eggs', ',', 'bacon', ',', 'chicken', ',', 'cheese', 'and', 'tomatoes']
grammar_list = ['NOUN', ',', 'NOUN', ',', 'NOUN', ',', 'NOUN', 'AND', 'NOUN']

def permute_nouns():
    permuted_list = []
    comma_AND_indices = [index for index, p in enumerate(grammar_list) if p == "," or p == "AND"]
    # so 'comma_AND_indices' = [1, 3, 5, 7]

    for perm in permutations(word_list):
        observed_comma_AND_indices = [index for index, p in enumerate(perm) if p == "," or p == "and"]
        if comma_AND_indices == observed_comma_AND_indices:
            # What goes wrong here? Permutations that should not match still get appended below.
            permuted_list.append(perm)

    print(permuted_list)

permute_nouns()

In this function I am using the itertools permutations method to create permutations of the word_list. However, I do not want all permutations. I only want the permutations where the commas and the word 'and' maintain their original position/indices in the word_list, and to append these to the permuted_list.

I am using the line if comma_AND_indices == observed_comma_AND_indices: to filter out the permutations I do not want, but it is not working and I do not understand why. On printing out the permuted_list I find that the positions of the commas and the 'and' are not preserved; instead, all permutations seem to be appended.

(You may be wondering why bother with using the grammar_list in the function, but the code here is part of a slightly bigger script in which the grammar_list plays its role)

Any help shedding light on this would be appreciated.

Darren

EDIT: Here is a sample of what is printing out for me:

[('eggs', ',', 'bacon', ',', 'chicken', ',', 'cheese', 'and', 'tomatoes'), ('eggs', ',', 'bacon', ',', 'chicken', ',', 'tomatoes', 'and', 'cheese'), ('eggs', ',', 'bacon', ',', 'chicken', 'and', 'cheese', ',', 'tomatoes'), ('eggs', ',', 'bacon', ',', 'chicken', 'and', 'tomatoes', ',', 'cheese'), ('eggs', ',', 'bacon', ',', 'cheese', ',', 'chicken', 'and', 'tomatoes'), ('eggs', ',', 'bacon', ',', 'cheese', ',', 'tomatoes', 'and', 'chicken'), ('eggs', ',', 'bacon', ',', 'cheese', 'and', 'chicken', ',', 'tomatoes'), ('eggs', ',', 'bacon', ',', 'cheese', 'and', 'tomatoes', ',', 'chicken'), ('eggs', ',', 'bacon', ',', 'tomatoes', ',', 'chicken', 'and', 'cheese'), ('eggs', ',', 'bacon', ',', 'tomatoes', ',', 'cheese', 'and', 'chicken'), ('eggs', ',', 'bacon', ',', 'tomatoes', 'and', 'chicken', ',', 'cheese'), ('eggs', ',', 'bacon', ',', 'tomatoes', 'and', 'cheese', ',', 'chicken'), ('eggs', ',', 'bacon', ',', 'chicken', ',', 'cheese', 'and', 'tomatoes'), ('eggs', ',', 'bacon', ',', 'chicken', ',', 'tomatoes', 'and', 'cheese'), ('eggs', ',', 'bacon', ',', 'chicken', 'and', 'cheese', ',', 'tomatoes'), ('eggs', ',', 'bacon', ',', 'chicken', 'and', 'tomatoes', ',', 'cheese'), ('eggs', ',', 'bacon', ',', 'cheese', ',', 'chicken', 'and', 'tomatoes'), ('eggs', ',', 'bacon', ',', 'cheese', ',', 'tomatoes', 'and', 'chicken'), ('eggs', ',', 'bacon', ',', 'cheese', 'and', 'chicken', ',', 'tomatoes'), ('eggs', ',', 'bacon', ',', 'cheese', 'and', 'tomatoes', ',', 'chicken'), ('eggs', ',', 'bacon', ',', 'tomatoes', ',', 'chicken', 'and', 'cheese'), ('eggs', ',', 'bacon', ',', 'tomatoes', ',', 'cheese', 'and', 'chicken'), ('eggs', ',', 'bacon', ',', 'tomatoes', 'and', 'chicken', ',', 'cheese'), ('eggs', ',', 'bacon', ',', 'tomatoes', 'and', 'cheese', ',', 'chicken'), ('eggs', ',', 'bacon', 'and', 'chicken', ',', 'cheese', ',', 'tomatoes'), ('eggs', ',', 'bacon', 'and', 'chicken', ',', 'tomatoes', ',', 'cheese'), ('eggs', ',', 'bacon', 'and', 'chicken', ',', 'cheese', ',', 'tomatoes'), 
('eggs', ',', 'bacon', 'and', 'chicken', ',', 'tomatoes', ',', 'cheese'), ('eggs', ',', 'bacon', 'and', 'cheese', ',', 'chicken', ',', 'tomatoes'), ('eggs', ',', 'bacon', 'and', 'cheese', ',', 'tomatoes', ',', 'chicken'), ('eggs', ',', 'bacon', 'and', 'cheese', ',', 'chicken', ',', 'tomatoes'), ('eggs', ',', 'bacon', 'and', 'cheese', ',', 'tomatoes', ',', 'chicken'), ('eggs', ',', 'bacon', 'and', 'tomatoes', ',', 'chicken', ',', 'cheese'), ('eggs', ',', 'bacon', 'and', 'tomatoes', ',', 'cheese', ',', 'chicken'), ('eggs', ',', 'bacon', 'and', 'tomatoes', ',', 'chicken', ',', 'cheese'), ('eggs', ',', 'bacon', 'and', 'tomatoes', ',', 'cheese', ',', 'chicken'), ('eggs', ',', 'chicken', ',', 'bacon', ',', 'cheese', 'and', 'tomatoes'), ('eggs', ',', 'chicken', ',', 'bacon', ',', 'tomatoes', 'and', 'cheese'), ('eggs', ',', 'chicken', ',', 'bacon', 'and', 'cheese', ',', 'tomatoes'), ('eggs', ',', 'chicken', ',', 'bacon', 'and', 'tomatoes', ',', 'cheese'), ('eggs', ',', 'chicken', ',', 'cheese', ',', 'bacon', 'and', 'tomatoes'), ('eggs', ',', 'chicken', ',', 'cheese', ',', 'tomatoes', 'and', 'bacon'), ('eggs', ',', 'chicken', ',', 'cheese', 'and', 'bacon', ',', 'tomatoes'), ('eggs', ',', 'chicken', ',', 'cheese', 'and', 'tomatoes', ',', 'bacon'), ('eggs', ',', 'chicken', ',', 'tomatoes', ',', 'bacon', 'and', 'cheese'), ('eggs', ',', 'chicken', ',', 'tomatoes', ',', 'cheese', 'and', 'bacon'), ('eggs', ',', 'chicken', ',', 'tomatoes', 'and', 'bacon', ',', 'cheese'), ('eggs', ',', 'chicken', ',', 'tomatoes', 'and', 'cheese', ',', 'bacon'), ('eggs', ',', 'chicken', ',', 'bacon', ',', 'cheese', 'and', 'tomatoes'), ('eggs', ',', 'chicken', ',', 'bacon', ',', 'tomatoes', 'and', 'cheese'), ('eggs', ',', 'chicken', ',', 'bacon', 'and', 'cheese', ',', 'tomatoes'), ('eggs', ',', 'chicken', ',', 'bacon', 'and', 'tomatoes', ',', 'cheese'), ('eggs', ',', 'chicken', ',', 'cheese', ',', 'bacon', 'and', 'tomatoes'), ('eggs', ',', 'chicken', ',', 'cheese', ',', 'tomatoes', 'and', 'bacon'), 
('eggs', ',', 'chicken', ',', 'cheese', 'and', 'bacon', ',', 'tomatoes'),

Solution

Your code works just fine, although you could generate the same list faster and more concisely with a product() of the permutations of [','] * 3 + ['and'] and the permutations of [w for w in word_list if w not in (',', 'and')], producing the same 120 * 24 = 2880 combinations.
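A minimal sketch of that product() idea, assuming nouns and separators strictly alternate as they do in word_list. The helper name interleave is made up for illustration and is not part of the original code:

```python
from itertools import permutations, product

word_list = ['eggs', ',', 'bacon', ',', 'chicken', ',', 'cheese', 'and', 'tomatoes']

nouns = [w for w in word_list if w not in (',', 'and')]
separators = [w for w in word_list if w in (',', 'and')]  # [',', ',', ',', 'and']

def interleave(noun_perm, sep_perm):
    """Hypothetical helper: nouns go to even indices, separators to odd ones."""
    sentence = [None] * (len(noun_perm) + len(sep_perm))
    sentence[0::2] = noun_perm
    sentence[1::2] = sep_perm
    return tuple(sentence)

# Pair every noun ordering with every separator ordering.
sentences = [interleave(n, s)
             for n, s in product(permutations(nouns), permutations(separators))]

print(len(sentences))  # 120 * 24 = 2880
```

This avoids walking all 9! permutations of the full list, which is where the speed-up should come from.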

If you were expecting only 120 results, then you are forgetting that you are not testing the order of the 3 commas and the word 'and' in your output; there are 24 different permutations of that list allowed:

>>> len(list(permutations([','] * 3 + ['and'])))
24

In other words, for any given permutation of just the nouns you are producing 24 variations of the sentence, with the 3 commas and the word 'and' shuffled among the four separator positions.
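To make the count concrete, here is a sketch of a stricter filter that checks which token sits at each separator index, not merely whether some separator does. The names fixed, matches and unique are illustrative. Because permutations() treats the three commas as distinct elements, identical tuples still show up several times, so a set is used to drop them:

```python
from itertools import permutations

word_list = ['eggs', ',', 'bacon', ',', 'chicken', ',', 'cheese', 'and', 'tomatoes']

# Map each separator index to the exact token expected there.
fixed = {i: w for i, w in enumerate(word_list) if w in (',', 'and')}

# Keep only permutations whose separators are the same tokens at the same indices.
matches = [perm for perm in permutations(word_list)
           if all(perm[i] == w for i, w in fixed.items())]

# The three commas are interchangeable, so each sentence appears 3! = 6 times.
unique = set(matches)

print(len(matches), len(unique))  # 720 120
```

This still walks all 9! = 362880 permutations, so it is illustrative rather than efficient.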

To produce just the 120 combinations of the nouns:

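from itertools import permutations, zip_longest  # Python 3; use izip_longest on Python 2

nouns = [w for w in word_list if w not in (',', 'and')]
grammar = [w for w in word_list if w in (',', 'and')]
result = []
for perm in permutations(nouns):
    # Interleave each noun with the separator that follows it; zip_longest pads the shorter list with None.
    result.append([w for word, g in zip_longest(perm, grammar) for w in (word, g) if w is not None])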

Other suggestions

If duplicates didn't matter, you could've just used itertools.product:

import itertools

for words in itertools.product(*(['a'], ['big', 'fat'], ['dog', 'house'])):
    print(' '.join(words))

Which prints:

a big dog
a big house
a fat dog
a fat house

But since they do, you have to do something a little more complicated:

import itertools
import collections

grammar = ['NOUN', ',', 'NOUN', ',', 'NOUN', ',', 'NOUN', 'AND', 'NOUN']
parts_of_speech = {
    'NOUN': ['eggs', 'bacon', 'chicken', 'cheese', 'tomatoes'],
    'AND': ['and'],
    ',': [',']
}

def partial_sentences(words, indices, sentence_length):
    if len(indices) > len(words):
        orderings = itertools.product(words, repeat=len(indices))
    else:
        orderings = itertools.permutations(words, len(indices))

    for words in orderings:
        sentence = [None] * sentence_length

        for index, word in zip(indices, words):
            sentence[index] = word

        yield sentence

def pos_stacks(parts_of_speech, grammar):
    positions = collections.defaultdict(list)

    for index, pos in enumerate(grammar):
        positions[pos].append(index)

    for pos, indices in positions.items():
        yield partial_sentences(parts_of_speech[pos], indices, len(grammar))

for result in itertools.product(*pos_stacks(parts_of_speech, grammar)):
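    # Take the first filled-in word at each position (itertools.ifilter no longer exists in Python 3).
    sentence = [next(w for w in words if w is not None) for words in zip(*result)]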

    print(sentence)

It essentially creates all possible orderings of the words in their proper positions, loops through all the parts of speech, and "stacks" together the sentences.
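The "stacking" step can be seen in isolation with two hand-written partial sentences. This is only a sketch: the stack helper and the tiny three-slot grammar are assumptions for illustration, not part of the code above:

```python
# Each partial sentence fills only the slots for one part of speech.
partial_nouns = ['eggs', None, 'bacon']
partial_commas = [None, ',', None]

def stack(*partials):
    """Hypothetical helper: take the first non-None word in each column."""
    return [next(w for w in column if w is not None)
            for column in zip(*partials)]

print(stack(partial_nouns, partial_commas))  # ['eggs', ',', 'bacon']
```

Each column produced by zip(*partials) holds exactly one real word, because every position in the grammar belongs to exactly one part of speech.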

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow