How to use Python While with Assignment [4 Examples]

In this Python tutorial, you will learn how to use a Python while loop with assignment, with multiple examples and different approaches.

While working on a project, I was assigned to optimise the code. During my research, I found that we can assign variables within a while loop in Python, which can save a few lines of code.

Generally, we use the “=” assignment operator to assign a variable in Python. We will also try the walrus operator “:=” to assign a variable within the condition of a while loop.

Let’s understand every example one by one with some practical scenarios.


Python while loop with the assignment using the “=” operator

First, we will use the “=” operator, the standard assignment operator in Python. It assigns a value to a variable, and we will use it to assign variables inside the body of a while loop.

Let’s see how a Python while loop with assignment works.


In the above code, the outer loop starts with while True:, a condition that is always true, so the loop would iterate forever unless something stops it. Inside the loop, we assign the variable line = "Hello" with a default string value.

Then we assign i = 0 and start a nested while loop that calls print(line[i], '-', ord(line[i])) to visit every character one by one and print its ASCII value using the built-in ord() function.
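The code this explanation refers to is not reproduced here. A minimal sketch consistent with the description might look like the following; the break at the end and the outputs list are assumptions added so the sketch terminates and its result can be inspected.

```python
outputs = []
while True:
    line = "Hello"              # plain "=" assignment inside the loop body
    i = 0
    while i < len(line):        # nested loop walks the string by index
        print(line[i], '-', ord(line[i]))
        outputs.append((line[i], ord(line[i])))
        i += 1
    break                       # assumption: stop after one pass so the sketch ends
```

Without the break, the outer loop would print the same five lines forever.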

Python assign variables in the while condition using the walrus operator

We can also use the walrus operator “:=”, the assignment expression operator introduced in Python 3.8. It assigns a value to a variable as part of an expression, creating the variable if it does not already exist.

Let’s see how we can use it in a while loop:


In the above code, we have a list named capitals. The while condition creates current_capital and gives it the value capitals.pop(0), which removes and returns the first element on every iteration. With the walrus operator, the whole condition is written as (current_capital := capitals.pop(0)) != "Austin".

When the loop reaches "Austin", the comparison "Austin" != "Austin" evaluates to False, so the loop stops iterating.
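As a minimal sketch of that loop: the contents of the capitals list and the visited list are assumptions chosen for illustration, since the original list is not shown.

```python
capitals = ["Sacramento", "Denver", "Austin", "Boston"]   # assumed sample data
visited = []

# pop(0) removes the first item; the walrus binds it and the comparison tests it.
while (current_capital := capitals.pop(0)) != "Austin":
    visited.append(current_capital)
    print(current_capital)
```

The parentheses matter: without them, current_capital would be bound to the result of the comparison rather than to the popped city. If "Austin" were missing from the list, pop(0) would eventually raise IndexError on the empty list.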

Python while with assignment by taking user input

Here, we will see how a Python while loop with assignment works when we take user input in the loop condition using the walrus operator.


In the above code, we assign a variable in the while condition while taking an integer from the user. Then, we check whether the number is even or odd using the % operator.

The program asks the user to enter a number repeatedly: because there is no break statement anywhere, the loop runs indefinitely.
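A sketch of the same idea that can run without a keyboard: the function name classify is hypothetical, and next(stream, None) is an assumed stand-in for input() so that the loop ends when the simulated input runs out.

```python
def classify(entries):
    """Label each entry as even or odd until the simulated input runs out."""
    stream = iter(entries)
    labels = []
    # next(stream, None) plays the role of input(); None means no more input.
    while (text := next(stream, None)) is not None:
        number = int(text)
        labels.append((number, "even" if number % 2 == 0 else "odd"))
    return labels

print(classify(["4", "7", "10"]))
```

With real input, the condition could instead compare the entered text against a sentinel such as an empty string, which gives the user a way to stop the loop.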

Python While with Assignment by calling the function

Now, we will see how to assign a variable inside a while loop in Python from the return value of a function call. We will create a user-defined function so you can see how it works.


In the above code, we create a function called get_ascii_value(). This function takes a string as a parameter and returns the ASCII value of each character.

Then, we start a while loop that takes user_input as a string. The result variable stores the function's return value: a dictionary that maps each character to its ASCII value.
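A minimal sketch of this pattern follows. The body of get_ascii_value() is an assumption based on the description, and the pending_inputs list is a hypothetical replacement for input() so the sketch runs unattended; an empty string ends the loop.

```python
def get_ascii_value(text):
    """Return a dictionary mapping each character of text to its code point."""
    return {char: ord(char) for char in text}

# Assumption: a fixed list replaces interactive input(); "" ends the loop.
pending_inputs = ["Hi", "ok", ""]
results = []
while (user_input := pending_inputs.pop(0)) != "":
    result = get_ascii_value(user_input)   # assignment from a function call
    results.append(result)
    print(result)
```

Because a dictionary keeps one entry per key, a repeated character such as the two l's in "Hello" appears only once in the result.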

In this Python article, you learned how to use a Python while loop with assignment through different approaches and examples. We covered several scenarios, such as assigning within a while loop, assigning in the loop condition with the walrus operator, taking user input, and calling user-defined functions.


Bijay - Python Expert

I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I have been working on Python, machine learning, and artificial intelligence for the last 5 years. During this time, I have gained expertise in various Python libraries such as Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, TensorFlow, SciPy, and Scikit-Learn for various clients in the United States, Canada, the United Kingdom, Australia, New Zealand, etc.


Python While Loop with Multiple Conditions

  • September 25, 2021 December 20, 2022


In this tutorial, you’ll learn how to write a Python while loop with multiple conditions, including and and or conditions. You’ll also learn how to use the NOT operator as well as how to group multiple conditions.

The Quick Answer: Embed Conditions with AND or OR Operators in Your While Loop


What is a Python While Loop

A Python while loop is an example of iteration, meaning that some Python statement is executed a certain number of times or while a condition is true. A while loop is similar to a Python for loop, but it is executed differently. A Python while loop can be used for definite iteration, meaning that it iterates a definite number of times, and for indefinite iteration, meaning that it iterates an indefinite number of times.

Let’s take a quick look at how a while loop is written in Python:

In the example above, the while loop will complete the step do something indefinitely, until the condition is no longer met.

If, for example, we wrote:

The program would run indefinitely, until the condition is no longer True. Because of this, we need to be careful about executing a while loop.

To see how we can stop a while loop in Python, let’s take a look at the example below:
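The original example is not reproduced here. One common way to stop a loop is to change a variable that the condition depends on; the following is a minimal sketch with an assumed counter named count.

```python
count = 0
while count < 3:          # the condition is re-checked before every pass
    print("Iteration", count)
    count += 1            # without this update the condition would never become False
```

After the third pass count equals 3, the condition count < 3 is False, and execution continues after the loop.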

In the sections below, you'll learn more about how the Python while loop can be implemented with multiple conditions. Let's get started!


Python While Loop with Multiple Conditions Using AND

Now that you've had a quick recap of how to write a Python while loop, let's take a look at how we can write a while loop with multiple conditions using the AND keyword.

In this case, we want all of the conditions to be true, whether there are two, three, or more conditions to be met.

To accomplish meeting two conditions, we simply place the and keyword between each of the conditions. Let's take a look at what this looks like:

We can see here that the code iterates only while both of the conditions are true. As soon as a = 4, the condition a < 4 is no longer true and the code stops execution.
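A sketch of such a loop, assuming starting values a = 0 and b = 10 and a second condition b > 3 (the original values are not shown); the iterations counter is added only to make the result easy to check.

```python
a = 0
b = 10
iterations = 0

while a < 4 and b > 3:    # both conditions must hold for the body to run
    print(f"a is {a} and b is {b}")
    a += 1
    b -= 1
    iterations += 1
```

The loop body runs four times. When a reaches 4, b is still 6, but a single false operand is enough to make the whole and expression false.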

Now let's take a look at how we can implement an or condition in a Python while loop.


Python While Loop with Multiple Conditions Using OR

Similar to using the and keyword in a Python while loop, we can also check if any of the conditions are true. For this, we use the or keyword, which checks whether at least one of our conditions is true.

In order to implement this, we simply place the or keyword in between the two conditions. We can also use more than two conditions and this would work in the same way.

For easier learning, let's stick to two conditions:

We can see that by simply switching from and to or, our code executes many more times. In fact, the code runs until neither condition is true.
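For comparison, here is a sketch with the same assumed starting values (a = 0, b = 10) and assumed conditions (a < 4, b > 3), joined with or.

```python
a = 0
b = 10
iterations = 0

while a < 4 or b > 3:     # the body runs while at least one condition holds
    print(f"a is {a} and b is {b}")
    a += 1
    b -= 1
    iterations += 1
```

Here the body runs seven times. From the fifth pass onward a < 4 is already false, and the loop keeps going only because b > 3 still holds; it stops once b has dropped to 3.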

Using a NOT Operator in a Python While Loop with Multiple Conditions

Another important and helpful operator to apply in Python while loops is the not operator. This operator simply reverses the truth of a statement. For example, if we wrote not True, it would evaluate to False. This can be immensely helpful when trying to write your code in a more plain-language style.

Let's see how we can apply this in one of our examples:

Here our code checks that a is less than 4 and that b is not less than 3. Because of this, our code only executes here until a is equal to 4.
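A sketch of the condition described here, again assuming starting values a = 0 and b = 10:

```python
a = 0
b = 10
iterations = 0

while a < 4 and not b < 3:    # "not b < 3" is true as long as b is 3 or more
    print(f"a is {a} and b is {b}")
    a += 1
    b -= 1
    iterations += 1
```

Because not binds more loosely than the comparison, not b < 3 is read as not (b < 3). With these values b never drops below 3 before a reaches 4, so a < 4 is what ends the loop.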

Next, let's take a look at how to group multiple conditions in a Python while loop.

How to Group Multiple Conditions in a Python While Loop

There may be many times that you want to group multiple conditions, including mixing and and or statements. When you do this, it's important to understand the order in which these conditions are evaluated. Anything placed in parentheses will be evaluated together first.

To better understand this, let's take a look at this example:

In the code above, if either a or b evaluates to True and c is True, then the code will run.

The possible outcomes of such an expression can be laid out in a truth table, and it's an important concept to understand.

In essence, the parentheses reduce the expression to a single truth that is checked against, simplifying the truth statement significantly.
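To make the truth table concrete, the following sketch enumerates every combination of three Boolean values for the grouped expression (a or b) and c. The names rows and true_count are illustrative only.

```python
from itertools import product

rows = []
true_count = 0
for a, b, c in product([True, False], repeat=3):
    outcome = (a or b) and c
    rows.append((a, b, c, outcome))
    if outcome:
        true_count += 1
    print(a, b, c, "->", outcome)

# Without parentheses, "and" binds tighter than "or", which changes the result.
grouped = (True or False) and False      # False
ungrouped = True or False and False      # True, read as True or (False and False)
```

Only three of the eight rows are True: c must be True and at least one of a and b must be True. The last two lines show why the parentheses are not optional here.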

Now, let's take a look at a practical, hands-on example to better understand this:

We can see here that the code stops after the third iteration. The reason for this is that a is less than 4 and b is greater than 3 after the third iteration. Because neither of the conditions in the parentheses is met, the code stops executing.

In this post, you learned how to use a Python while loop with multiple conditions. You learned how to use a Python while loop with both AND and OR conditions, as well as how to use the NOT operator. Finally, you learned how to group multiple conditions in a Python while loop.

To learn more about Python while loops, check out the official Python documentation.


Nik Piepenbreier

Nik is the author of datagy.io and has over a decade of experience working with data analytics, data science, and Python. He specializes in teaching developers how to use Python for data science using hands-on tutorials.


Python while Loop


In Python, we use the while loop to repeat a block of code until a certain condition is met. For example,

In the above example, we have used a while loop to print the numbers from 1 to 3 . The loop runs as long as the condition number <= 3 is satisfied.
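The example itself is missing from this copy. A minimal sketch that matches the description (the variable name number and the condition number <= 3 come from the text above):

```python
number = 1

while number <= 3:
    print(number)
    number = number + 1   # move toward the point where the condition fails
```

The loop prints 1, 2 and 3; when number becomes 4 the condition is False and the loop ends.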

while Loop Syntax

  • The while loop evaluates the condition.
  • If the condition is true, the body of the while loop is executed. The condition is evaluated again.
  • This process continues until the condition is False .
  • Once the condition evaluates to False , the loop terminates.

[Figure: Flowchart of Python while Loop]

Example: Python while Loop

Here is how the above program works:

  • It asks the user to enter a number.
  • If the user enters a number other than 0 , it adds the number to the total and asks the user to enter a number again.
  • If the user enters 0 , the loop terminates and the program displays the total.
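The program these steps describe is not shown here. The sketch below follows the same steps; the function name sum_until_zero is hypothetical, and an iterator over a list of strings is an assumed stand-in for repeated input() calls so the sketch can run unattended.

```python
def sum_until_zero(entries):
    """Add up entries until a 0 is read, then return the total."""
    stream = iter(entries)          # stands in for repeated input() calls
    total = 0
    number = int(next(stream))      # first "Enter a number"
    while number != 0:
        total += number
        number = int(next(stream))  # ask again
    return total

print(sum_until_zero(["12", "4", "-5", "0"]))
```

In an interactive version, each next(stream) would be replaced by input("Enter a number: ").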
Infinite while Loop

If the condition of a while loop is always True , the loop runs forever, forming an infinite while loop . For example,

The above program is equivalent to:

More on Python while Loop

We can use a break statement inside a while loop to terminate the loop immediately without checking the test condition. For example,

Here, the condition of the while loop is always True . However, if the user enters end , the loop terminates because of the break statement.
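A sketch of this behaviour, assuming a fixed list of strings in place of what the user would type (the names typed and echoed are illustrative):

```python
# Assumption: a fixed list stands in for what the user would type at input().
typed = iter(["hello", "world", "end", "ignored"])
echoed = []

while True:
    user_input = next(typed)
    if user_input == "end":
        break                   # leaves the loop without re-checking the condition
    echoed.append(user_input)

print(echoed)
```

The entry after "end" is never read, because break exits the loop immediately.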

A while loop can also have an optional else block. Here, on the third check of the condition, counter is 2, so the condition is False and the loop terminates. It then executes the else block and prints This is inside else block .

Note : The else block will not execute if the while loop is terminated by a break statement.
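The two cases can be sketched side by side. The first loop follows the description above; the second loop, with its attempts and else_ran names, is an added illustration of the note about break.

```python
# Case 1: the loop ends because its condition became False, so else runs.
counter = 0
log = []
while counter < 2:
    log.append("This is inside loop")
    counter = counter + 1
else:
    log.append("This is inside else block")

# Case 2: the loop ends through break, so else is skipped.
attempts = 0
else_ran = False
while attempts < 5:
    attempts += 1
    if attempts == 3:
        break
else:
    else_ran = True

print(log, else_ran)
```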

The for loop is usually used when the number of iterations is known. For example,

The while loop is usually used when the number of iterations is unknown. For example,


Python While Loops

Python Loops

Python has two primitive loop commands:

  • while loops
  • for loops

The while Loop

With the while loop we can execute a set of statements as long as a condition is true.

Print i as long as i is less than 6:

Note: remember to increment i, or else the loop will continue forever.

The while loop requires relevant variables to be ready. In this example we need to define an indexing variable, i, which we set to 1.
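A minimal sketch of the loop described above:

```python
i = 1             # the indexing variable must exist before the loop starts

while i < 6:
    print(i)
    i += 1        # without this increment, i < 6 would stay true forever
```

This prints the numbers 1 through 5 and leaves i at 6.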

The break Statement

With the break statement we can stop the loop even if the while condition is true:

Exit the loop when i is 3:


The continue Statement

With the continue statement we can stop the current iteration, and continue with the next:

Continue to the next iteration if i is 3:
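A sketch of this behaviour; the printed list is an addition so the skipped value is easy to see.

```python
i = 0
printed = []

while i < 6:
    i += 1            # increment first, otherwise continue would loop on 3 forever
    if i == 3:
        continue      # skip the rest of this pass; 3 is not recorded
    printed.append(i)

print(printed)
```

Note the position of the increment: if it came after the continue, i would stay at 3 and the loop would never finish.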

The else Statement

With the else statement we can run a block of code once when the condition is no longer true:

Print a message once the condition is false:


Python Conditional Assignment

When you want to assign a value to a variable based on some condition (if the condition is true, assign one value; otherwise, assign another), you can use conditional assignment.

In this tutorial, we will look at different ways to assign values to a variable based on some condition.

1. Using Ternary Operator

The ternary operator (a conditional expression) is used in Python to choose between two values based on some condition, and the result can be assigned to a variable.

It goes like this:

Here, the value of variable will be value_if_true if the condition is true, else it will be value_if_false .

Let's see a code snippet to understand it better.

You can see we have conditionally assigned a value to variable c based on the condition a > b .
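The snippet itself is missing from this copy. A minimal sketch, with the values 10, 20, 30 and 40 assumed for illustration:

```python
a = 10
b = 20

# General form: variable = value_if_true if condition else value_if_false
c = 30 if a > b else 40

print(c)
```

Since a > b is False here, c receives the value after else.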

2. Using if-else statement

if-else statements are a core part of any programming language. They are used to execute a block of code based on some condition.

Using an if-else statement, we can assign a value to a variable based on the condition we provide.

Here is an example of replacing the above code snippet with the if-else statement.

3. Using Logical Short Circuit Evaluation

Logical short circuit evaluation is another way to assign a value to a variable conditionally.

The format of logical short circuit evaluation is:

It looks similar to the ternary operator, but it is not the same. Here condition and value_if_true are combined with a logical AND: if both are truthy, the value of variable will be value_if_true ; otherwise it will be value_if_false .

Let's see an example:

But if the condition is True while value_if_true is falsy (False, 0, None, or an empty container), then the value of variable will be value_if_false .

In that case, the value of c is 20 even though the condition a < b is True .

So, you should be careful while using logical short circuit evaluation.
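The difference can be sketched as follows, assuming the form variable = condition and value_if_true or value_if_false and illustrative values for a and b. The names d and e are added here for comparison.

```python
a = 10
b = 20

# Works as expected when value_if_true is truthy.
c = a < b and 30 or 40
print(c)  # 30

# Pitfall: 0 is falsy, so the "or" branch wins even though a < b is True.
d = a < b and 0 or 20
print(d)  # 20

# The ternary form returns the intended value.
e = 0 if a < b else 20
print(e)  # 0
```

This is why the ternary form is the safer choice whenever value_if_true could be 0, an empty string, an empty list, or None.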

While working with lists , we often need to check if a list is empty or not, and if it is empty then we need to assign some default value to it.

Let's see how we can do it using conditional assignment.

Here, we have assigned a default value to my_list if it is empty.
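A sketch of that default-value idiom; the default [1, 2, 3] and the second list named other are assumptions for illustration.

```python
my_list = []
my_list = my_list or [1, 2, 3]   # an empty list is falsy, so the default is used

other = [7]
other = other or [1, 2, 3]       # a non-empty list is truthy, so it is kept

print(my_list, other)
```

This relies on the same short circuit behaviour as above, so it also replaces any other falsy value (such as None) with the default.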

Assign a value to a variable conditionally based on the presence of an element in a list.

Now you know 3 different ways to assign a value to a variable conditionally. Any of these methods can be used to assign a value when there is a condition.

The ternary operator is usually the cleanest way to assign a value conditionally.

An if-else statement is recommended when you have to execute a block of code based on some condition.

Happy coding! 😊


18 Python while Loop Examples and Exercises

In Python programming, we use while loops to repeat a task. The while loop checks a condition and executes the task as long as that condition is satisfied. The loop stops once the condition is no longer satisfied.

The syntax of a while loop is as follows:

In this post, I have added some simple examples of using while loops in Python for various needs. Check out these examples to get a clear idea of how while loops work in Python. Let’s dive right in.

1. Example of using while loops in Python


2. Example of using the break statement in while loops

In Python, we can use the break statement to end a while loop prematurely.

3. Example of using the continue statement in while loops

In Python, we can use the continue statement to stop the current iteration of the while loop and continue with the next one.

4. Using if-elif-else statements inside while loop

5. Adding elements to a list using while loop

6. Python while loop to print a number series

7. Printing the items in a tuple using while loop

8. Finding the sum of numbers in a list using while loop

9. Popping out elements from a list using while loop

10. Printing all letters except some using Python while loop

11. Python while loop to take inputs from the user


12. Converting numbers from decimal to binary using while loop
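The original code for this example is not reproduced here. One possible sketch uses repeated division by 2; the function name to_binary is hypothetical.

```python
def to_binary(number):
    """Convert a non-negative integer to its binary digits as a string."""
    if number == 0:
        return "0"
    digits = ""
    while number > 0:
        digits = str(number % 2) + digits   # prepend the remainder
        number //= 2                        # drop the bit just recorded
    return digits

print(to_binary(10))
```

The loop collects the least significant bit first, which is why each new digit is placed in front of the ones already found.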

13. Finding the average of 5 numbers using while loop

14. Printing the square of numbers using while loop

15. Finding the multiples of a number using while loop

16. Reversing a number using while loop in Python

17. Finding the sum of even numbers using while loop

18. Finding the factorial of a given number using while loop
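The original code for this example is not reproduced here. A minimal sketch, assuming a non-negative integer argument and a hypothetical function name factorial:

```python
def factorial(n):
    """Return n! for a non-negative integer n, using a while loop."""
    result = 1
    while n > 1:
        result *= n   # multiply by the current value
        n -= 1        # then count down toward 1
    return result

print(factorial(5))
```

For 0 and 1 the loop body never runs and the function returns 1, which matches the usual definition.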


I hope this article was helpful. Check out my post on 21 Python for Loop Examples .

I'm the face behind Pythonista Planet. I learned my first programming language back in 2015. Ever since then, I've been learning programming and immersing myself in technology. On this site, I share everything that I've learned about computer programming.

16 thoughts on “ 18 Python while Loop Examples and Exercises ”

I am looking for a way to take user inputs in while loop & compare it & print the smallest/largest value. Can you help?

Thank you for giving us this kind of content for free, much appreciated 🙂 I was curious about example 9, popping out elements from a list using a while loop:

    fruitsList = ["Mango", "Apple", "Orange", "Guava"]

    while len(fruitsList) > 3:
        fruitsList.pop()
        print(fruitsList)

    n = "e"
    numlist = []
    # keep reading until the user presses Enter on an empty line
    while n != "":
        n = input("Enter A Number: ")
        if n != "":
            numlist.append(int(n))
    print(numlist)
    print(max(numlist))
    print(min(numlist))

Write a program that reads a value V, and then starts to read further values and adds them to a List until the initial value V is added again. Don’t add the first V and last V to the list. Print the list to the console.

Input Format

N numbers of lines input, with a random string

Output Format

Single line output, as a list.

Sample Input 1

56 23 346 457 234 436 689 68 80 25 567 56

Sample Output 1

['23', '346', '457', '234', '436', '689', '68', '80', '25', '567']
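This exercise fits the assignment-in-condition pattern well. The following is one possible sketch, not an official solution: the function name collect_between is hypothetical, and the values are passed in as a list instead of being read with input(). It assumes the initial value does appear again; otherwise next() raises StopIteration.

```python
def collect_between(values):
    """Collect values read after the first one until that first value reappears."""
    stream = iter(values)
    sentinel = next(stream)             # the initial value V
    collected = []
    while (item := next(stream)) != sentinel:
        collected.append(item)
    return collected

sample = "56 23 346 457 234 436 689 68 80 25 567 56".split()
print(collect_between(sample))
```

With keyboard input, sentinel = input() and (item := input()) != sentinel would take the place of the two next() calls.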

    x = int(input('How many users will actually provide numerical values?'))
    i = x
    internal_list = []

    def main(List: list != None):
        print(f"The given list of numerical entities is formed by {internal_list}")
        if i >= x:
            print(f"The maximal value contained by user's list is equal with... {max(internal_list)}")
        else:
            pass

    x = 1
    while i >= x:
        j: int = input("Could you introduce your personal number?")
        internal_list.append(j)
        x += 1
        if i <= x:
            main(List=internal_list)
            break

You can append the numbers in the list and find the minimum or maximum.

    i = 0
    newlist = []  # create an empty list
    while i < 5:  # for 5 values
        x = int(input(' Enter numbers'))
        i += 1
        newlist.append(x)
    print(newlist)  # you may skip this line
    print("The smallest number is", min(newlist))

The output will be:

    Enter numbers 3
    Enter numbers 4
    Enter numbers 8
    Enter numbers 2
    Enter numbers 9
    [3, 4, 8, 2, 9]
    The smallest number is 2

My bro ,u are too much oo,u really open my eyes to many things about pythons ,which I did not know b4 .pls am a beginer to this course .I need yr help oo,to enable me know more sir. Thanks and God bless u

Printing the items in a tuple using while loop exercise shows wrong answer please review the answer.

I just checked again, and it is the correct answer. Can you check your code once again? Maybe you might have missed something in your code.

Hi, Ashwin Thanks for these informative and variative while loops examples; they are really helpful for practicing. 🙂

name = input(‘name’) strip= input(‘word’)

# Now i want to see both methods while and for to striped out the strip’s values from the name’s letters.

no output is coming in example 10.

    lst = []
    for i in range(5):
        num = int(input("Enter your numbers: "))
        lst += [num]
    print("The greater number is", max(lst))
    print("The smallest number is", min(lst))

How I wish you were able to write comments on your code lines to help newbies better understand the concept.

Rony its true but what do you mean by write coments on your code



PEP 572 – Assignment Expressions


This is a proposal for creating a way to assign to variables within an expression using the notation NAME := expr .

As part of this change, there is also an update to dictionary comprehension evaluation order to ensure key expressions are executed before value expressions (allowing the key to be bound to a name and then re-used as part of calculating the corresponding value).

During discussion of this PEP, the operator became informally known as “the walrus operator”. The construct’s formal name is “Assignment Expressions” (as per the PEP title), but they may also be referred to as “Named Expressions” (e.g. the CPython reference implementation uses that name internally).

Naming the result of an expression is an important part of programming, allowing a descriptive name to be used in place of a longer expression, and permitting reuse. Currently, this feature is available only in statement form, making it unavailable in list comprehensions and other expression contexts.

Additionally, naming sub-parts of a large expression can assist an interactive debugger, providing useful display hooks and partial results. Without a way to capture sub-expressions inline, this would require refactoring of the original code; with assignment expressions, this merely requires the insertion of a few name := markers. Removing the need to refactor reduces the likelihood that the code be inadvertently changed as part of debugging (a common cause of Heisenbugs), and is easier to dictate to another programmer.

During the development of this PEP many people (supporters and critics both) have had a tendency to focus on toy examples on the one hand, and on overly complex examples on the other.

The danger of toy examples is twofold: they are often too abstract to make anyone go “ooh, that’s compelling”, and they are easily refuted with “I would never write it that way anyway”.

The danger of overly complex examples is that they provide a convenient strawman for critics of the proposal to shoot down (“that’s obfuscated”).

Yet there is some use for both extremely simple and extremely complex examples: they are helpful to clarify the intended semantics. Therefore, there will be some of each below.

However, in order to be compelling , examples should be rooted in real code, i.e. code that was written without any thought of this PEP, as part of a useful application, however large or small. Tim Peters has been extremely helpful by going over his own personal code repository and picking examples of code he had written that (in his view) would have been clearer if rewritten with (sparing) use of assignment expressions. His conclusion: the current proposal would have allowed a modest but clear improvement in quite a few bits of code.

Another use of real code is to observe indirectly how much value programmers place on compactness. Guido van Rossum searched through a Dropbox code base and discovered some evidence that programmers value writing fewer lines over shorter lines.

Case in point: Guido found several examples where a programmer repeated a subexpression, slowing down the program, in order to save one line of code, e.g. instead of writing:

they would write:
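The two snippets are missing from this copy. The sketch below shows the contrast in runnable form: the string data, the pattern, and the variable names are assumptions chosen so that the code executes, and the third form is added to show what an assignment expression offers.

```python
import re

data = "order-42"
pattern = r"order-(\d+)"   # assumption: a concrete pattern so the sketch runs

# Form 1: name the match once, then test it (two lines, one regex call).
match = re.match(pattern, data)
group = match.group(1) if match else None

# Form 2: one line, but the same regex runs twice.
group_repeated = re.match(pattern, data).group(1) if re.match(pattern, data) else None

# With an assignment expression: one line and one regex call.
group_walrus = m.group(1) if (m := re.match(pattern, data)) else None

print(group, group_repeated, group_walrus)
```

The third form works because a conditional expression evaluates its condition first, so m is already bound when m.group(1) runs.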

Another example illustrates that programmers sometimes do more work to save an extra level of indentation:

This code tries to match pattern2 even if pattern1 has a match (in which case the match on pattern2 is never used). The more efficient rewrite would have been:
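A runnable sketch of that more efficient shape, followed by a flat version written with assignment expressions. The two patterns and the data string are hypothetical; only the structure is the point.

```python
import re

pattern1 = re.compile(r"id=(\d+)")          # hypothetical patterns
pattern2 = re.compile(r"(name)=(\w+)")
data = "name=walrus"

# Nested form: pattern2 is tried only when pattern1 fails.
match1 = pattern1.match(data)
if match1:
    result = match1.group(1)
else:
    match2 = pattern2.match(data)
    if match2:
        result = match2.group(2)
    else:
        result = None

# Same logic with assignment expressions, without the extra indentation.
if match1 := pattern1.match(data):
    flat_result = match1.group(1)
elif match2 := pattern2.match(data):
    flat_result = match2.group(2)
else:
    flat_result = None

print(result, flat_result)
```

In the flat version the second match is still attempted only when the first one fails, so no work is wasted to save a level of indentation.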

Syntax and semantics

In most contexts where arbitrary Python expressions can be used, a named expression can appear. This is of the form NAME := expr where expr is any valid Python expression other than an unparenthesized tuple, and NAME is an identifier.

The value of such a named expression is the same as the incorporated expression, with the additional side-effect that the target is assigned that value:
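A few small illustrations of this behaviour, written as a runnable sketch. The data, the 8-character chunk size, and the variable names are assumptions; io.StringIO stands in for a real file.

```python
import io

# The expression's value is the assigned value, so it can be tested directly.
data = "abc123"
if (n := len(data)) > 5:
    message = f"{n} characters"

# A read loop: each chunk is bound and tested in the loop condition.
stream = io.StringIO("assignment expressions")
chunks = []
while chunk := stream.read(8):
    chunks.append(chunk)

# Reusing a computed value inside a list display.
cubes = [y := 3, y**2, y**3]

print(message, chunks, cubes)
```

The read loop ends when read() returns an empty string, which is falsy; without an assignment expression the same loop needs either a duplicated read call or a while True with a break.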

There are a few places where assignment expressions are not allowed, in order to avoid ambiguities or user confusion:

  • Unparenthesized assignment expressions are prohibited at the top level of an expression statement. This rule is included to simplify the choice for the user between an assignment statement and an assignment expression – there is no syntactic position where both are valid.

  • Unparenthesized assignment expressions are prohibited at the top level of the right hand side of an assignment statement. Again, this rule is included to avoid two visually similar ways of saying the same thing.

  • Unparenthesized assignment expressions are prohibited for the value of a keyword argument in a call. This rule is included to disallow excessively confusing code, and because parsing keyword arguments is complex enough already.

  • Unparenthesized assignment expressions are prohibited at the top level of a function default value. This rule is included to discourage side effects in a position whose exact semantics are already confusing to many users (cf. the common style recommendation against mutable default values), and also to echo the similar prohibition in calls (the previous bullet).

  • Unparenthesized assignment expressions are prohibited as annotations for arguments, return values and assignments. The reasoning here is similar to the two previous cases; this ungrouped assortment of symbols and operators composed of : and = is hard to read correctly.

  • Unparenthesized assignment expressions are prohibited in lambda functions. This allows lambda to always bind less tightly than := ; having a name binding at the top level inside a lambda function is unlikely to be of value, as there is no way to make use of it. In cases where the name will be used more than once, the expression is likely to need parenthesizing anyway, so this prohibition will rarely affect code.

  • Assignment expressions inside of f-strings require parentheses. What looks like an assignment operator in an f-string is not always an assignment operator. The f-string parser uses : to indicate formatting options. To preserve backwards compatibility, assignment operator usage inside of f-strings must be parenthesized. As noted above, this usage of the assignment operator is not recommended.
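Several of these restrictions can be observed without running any of the code in question, by asking the compiler whether a source string is accepted. This is a sketch for Python 3.8 or later; the helper is_valid and the sample strings are illustrative, and the names used inside the strings (f, y, p, x) never need to exist because the strings are only compiled.

```python
def is_valid(source):
    """Return True if source compiles, False if it raises SyntaxError."""
    try:
        compile(source, "<sketch>", "exec")
    except SyntaxError:
        return False
    return True

samples = [
    "y := 1",                          # top level of an expression statement
    "(y := 1)",                        # parenthesized: accepted
    "y0 = y1 := 1",                    # right hand side of an assignment
    "y0 = (y1 := 1)",
    "f(x = y := 1)",                   # keyword argument value
    "f(x=(y := 1))",
    "def f(answer = p := 42): pass",   # function default value
    "def f(answer=(p := 42)): pass",
    "(lambda: x := 1)",                # inside a lambda
    "lambda: (x := 1)",
]

results = {}
for source in samples:
    results[source] = is_valid(source)
    print(f"{source!r:40} {'valid' if results[source] else 'SyntaxError'}")
```

In each pair, the unparenthesized form is rejected and the parenthesized form compiles.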

An assignment expression does not introduce a new scope. In most cases the scope in which the target will be bound is self-explanatory: it is the current scope. If this scope contains a nonlocal or global declaration for the target, the assignment expression honors that. A lambda (being an explicit, if anonymous, function definition) counts as a scope for this purpose.

There is one special case: an assignment expression occurring in a list, set or dict comprehension or in a generator expression (below collectively referred to as “comprehensions”) binds the target in the containing scope, honoring a nonlocal or global declaration for the target in that scope, if one exists. For the purpose of this rule the containing scope of a nested comprehension is the scope that contains the outermost comprehension. A lambda counts as a containing scope.

The motivation for this special case is twofold. First, it allows us to conveniently capture a “witness” for an any() expression, or a counterexample for all() , for example:
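A sketch of the witness and counterexample idea, using a made-up list of lines:

```python
lines = ["", "   ", "# first comment", "code = 1", "# second comment"]

# any() stops at the first match, so `comment` holds the witness.
if any((comment := line).startswith("#") for line in lines):
    first_comment = comment
else:
    first_comment = None

# all() stops at the first failure, so `nonblank` holds the counterexample.
if all((nonblank := line).strip() == "" for line in lines):
    first_nonblank = None
else:
    first_nonblank = nonblank
```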

Second, it allows a compact way of updating mutable state from a comprehension, for example:
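A sketch of this second use, computing running totals (the data is illustrative):

```python
values = [1, 2, 3, 4]

# `total` lives in the containing scope, so each iteration sees the
# value left behind by the previous one.
total = 0
partial_sums = [total := total + v for v in values]
```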

However, an assignment expression target name cannot be the same as a for -target name appearing in any comprehension containing the assignment expression. The latter names are local to the comprehension in which they appear, so it would be contradictory for a contained use of the same name to refer to the scope containing the outermost comprehension instead.

For example, [i := i+1 for i in range(5)] is invalid: the for i part establishes that i is local to the comprehension, but the i := part insists that i is not local to the comprehension. The same reason makes these examples invalid too:

While it’s technically possible to assign consistent semantics to these cases, it’s difficult to determine whether those semantics actually make sense in the absence of real use cases. Accordingly, the reference implementation [1] will ensure that such cases raise SyntaxError , rather than executing with implementation defined behaviour.

This restriction applies even if the assignment expression is never executed:

For the comprehension body (the part before the first “for” keyword) and the filter expression (the part after “if” and before any nested “for”), this restriction applies solely to target names that are also used as iteration variables in the comprehension. Lambda expressions appearing in these positions introduce a new explicit function scope, and hence may use assignment expressions with no additional restrictions.

Due to design constraints in the reference implementation (the symbol table analyser cannot easily detect when names are re-used between the leftmost comprehension iterable expression and the rest of the comprehension), named expressions are disallowed entirely as part of comprehension iterable expressions (the part after each “in”, and before any subsequent “if” or “for” keyword):

A further exception applies when an assignment expression occurs in a comprehension whose containing scope is a class scope. If the rules above were to result in the target being assigned in that class’s scope, the assignment expression is expressly invalid. This case also raises SyntaxError :

(The reason for the latter exception is the implicit function scope created for comprehensions – there is currently no runtime mechanism for a function to refer to a variable in the containing class scope, and we do not want to add such a mechanism. If this issue ever gets resolved this special case may be removed from the specification of assignment expressions. Note that the problem already exists for using a variable defined in the class scope from a comprehension.)
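The comprehension restrictions described in this section can be confirmed with compile(), since they are enforced at compile time even when the code would never run. This is a sketch assuming Python 3.8 or later; is_valid is a hypothetical helper and stuff is a placeholder name.

```python
def is_valid(source):
    """Return True if `source` compiles, False if it raises SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True

cases = {
    # Target reuses a for-target name of a containing comprehension.
    "[i := i + 1 for i in range(5)]": False,
    "[[(j := j) for i in range(5)] for j in range(5)]": False,
    "[i := 0 for i, j in stuff]": False,
    # Still rejected although the assignment could never execute.
    "[False and (i := 0) for i, j in stuff]": False,
    # Named expressions are not allowed in the iterable expression.
    "[i + 1 for i in (j := stuff)]": False,
    "[i + 1 for i in (i := stuff)]": False,
    # Comprehension directly inside a class body.
    "class Example:\n    [(j := i) for i in range(5)]": False,
    # A distinct target name at module level is fine.
    "[(j := i) for i in range(5)]": True,
}
results = {source: is_valid(source) for source in cases}
```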

See Appendix B for some examples of how the rules for targets in comprehensions translate to equivalent code.

The := operator groups more tightly than a comma in all syntactic positions where it is legal, but less tightly than all other operators, including or , and , not , and conditional expressions ( A if C else B ). As follows from section “Exceptional cases” above, it is never allowed at the same level as = . In case a different grouping is desired, parentheses should be used.

The := operator may be used directly in a positional function call argument; however it is invalid directly in a keyword argument.

Some examples to clarify what’s technically valid or invalid:
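The original listing is missing at this point. As a small sketch of the grouping rules just described (show is a hypothetical helper that only echoes its arguments):

```python
def show(*args, **kwargs):
    """Return the arguments unchanged so the call can be inspected."""
    return args, kwargs

pair = (x := 1, 2)                 # := binds tighter than the comma: x is 1
flag = (either := 0 or 5)          # but looser than `or`: either is 5
choice = (c := "a" if x else "b")  # and looser than a conditional expression

# Allowed directly as a positional argument; a keyword argument
# needs parentheses around the named expression.
call = show(p := 3, key=(k := "vector"))
```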

Most of the “valid” examples above are not recommended, since human readers of Python source code who are quickly glancing at some code may miss the distinction. But simple cases are not objectionable:

This PEP recommends always putting spaces around := , similar to PEP 8’s recommendation for = when used for assignment, whereas the latter disallows spaces around = used for keyword arguments.

In order to have precisely defined semantics, the proposal requires evaluation order to be well-defined. This is technically not a new requirement, as function calls may already have side effects. Python already has a rule that subexpressions are generally evaluated from left to right. However, assignment expressions make these side effects more visible, and we propose a single change to the current evaluation order:

  • In a dict comprehension {X: Y for ...} , Y is currently evaluated before X . We propose to change this so that X is evaluated before Y . (In a dict display like {X: Y} this is already the case, and also in dict((X, Y) for ...) which should clearly be equivalent to the dict comprehension.)
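The resulting order can be observed on Python 3.8 or later with a helper that records each evaluation (note is a hypothetical function used only for this sketch):

```python
order = []

def note(label, value):
    """Record which part is being evaluated, then pass the value through."""
    order.append(label)
    return value

mapping = {note("key", k): note("value", k * 10) for k in (1, 2)}
```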

Most importantly, since := is an expression, it can be used in contexts where statements are illegal, including lambda functions and comprehensions.

Conversely, assignment expressions don’t support the advanced features found in assignment statements:

  • Multiple targets are not directly supported:

        x = y = z = 0  # Equivalent: (z := (y := (x := 0)))

  • Single assignment targets other than a single NAME are not supported:

        # No equivalent
        a[i] = x
        self.rest = []

  • Priority around commas is different:

        x = 1, 2  # Sets x to (1, 2)
        (x := 1, 2)  # Sets x to 1

  • Iterable packing and unpacking (both regular or extended forms) are not supported:

        # Equivalent needs extra parentheses
        loc = x, y  # Use (loc := (x, y))
        info = name, phone, *rest  # Use (info := (name, phone, *rest))

        # No equivalent
        px, py, pz = position
        name, phone, email, *other_info = contact

  • Inline type annotations are not supported:

        # Closest equivalent is "p: Optional[int]" as a separate declaration
        p: Optional[int] = None

  • Augmented assignment is not supported:

        total += tax  # Equivalent: (total := total + tax)

The following changes have been made based on implementation experience and additional review after the PEP was first accepted and before Python 3.8 was released:

  • for consistency with other similar exceptions, and to avoid locking in an exception name that is not necessarily going to improve clarity for end users, the originally proposed TargetScopeError subclass of SyntaxError was dropped in favour of just raising SyntaxError directly. [3]
  • due to a limitation in CPython’s symbol table analysis process, the reference implementation raises SyntaxError for all uses of named expressions inside comprehension iterable expressions, rather than only raising them when the named expression target conflicts with one of the iteration variables in the comprehension. This could be revisited given sufficiently compelling examples, but the extra complexity needed to implement the more selective restriction doesn’t seem worthwhile for purely hypothetical use cases.

Examples from the Python standard library

env_base is only used on these lines, putting its assignment on the if moves it as the “header” of the block.

  • Current:

        env_base = os.environ.get("PYTHONUSERBASE", None)
        if env_base:
            return env_base

  • Improved:

        if env_base := os.environ.get("PYTHONUSERBASE", None):
            return env_base

Avoid nested if and remove one indentation level.

  • Current:

        if self._is_special:
            ans = self._check_nans(context=context)
            if ans:
                return ans

  • Improved:

        if self._is_special and (ans := self._check_nans(context=context)):
            return ans

Code looks more regular and avoids multiple nested if. (See Appendix A for the origin of this example.)

  • Current:

        reductor = dispatch_table.get(cls)
        if reductor:
            rv = reductor(x)
        else:
            reductor = getattr(x, "__reduce_ex__", None)
            if reductor:
                rv = reductor(4)
            else:
                reductor = getattr(x, "__reduce__", None)
                if reductor:
                    rv = reductor()
                else:
                    raise Error(
                        "un(deep)copyable object of type %s" % cls)

  • Improved:

        if reductor := dispatch_table.get(cls):
            rv = reductor(x)
        elif reductor := getattr(x, "__reduce_ex__", None):
            rv = reductor(4)
        elif reductor := getattr(x, "__reduce__", None):
            rv = reductor()
        else:
            raise Error("un(deep)copyable object of type %s" % cls)

tz is only used for s += tz , moving its assignment inside the if helps to show its scope.

  • Current:

        s = _format_time(self._hour, self._minute,
                         self._second, self._microsecond,
                         timespec)
        tz = self._tzstr()
        if tz:
            s += tz
        return s

  • Improved:

        s = _format_time(self._hour, self._minute,
                         self._second, self._microsecond,
                         timespec)
        if tz := self._tzstr():
            s += tz
        return s

Calling fp.readline() in the while condition and calling .match() on the if lines make the code more compact without making it harder to understand.

  • Current:

        while True:
            line = fp.readline()
            if not line:
                break
            m = define_rx.match(line)
            if m:
                n, v = m.group(1, 2)
                try:
                    v = int(v)
                except ValueError:
                    pass
                vars[n] = v
            else:
                m = undef_rx.match(line)
                if m:
                    vars[m.group(1)] = 0

  • Improved:

        while line := fp.readline():
            if m := define_rx.match(line):
                n, v = m.group(1, 2)
                try:
                    v = int(v)
                except ValueError:
                    pass
                vars[n] = v
            elif m := undef_rx.match(line):
                vars[m.group(1)] = 0

A list comprehension can map and filter efficiently by capturing the condition:
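A sketch of this pattern, where f and input_data are placeholders chosen for illustration:

```python
def f(x):
    """Stand-in for a computation we only want to perform once per item."""
    return x - 2

input_data = [1, 2, 3, 4]

# f(x) is called once; the result is tested in the filter and reused
# in the output expression.
results = [(x, y, x / y) for x in input_data if (y := f(x)) > 0]
```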

Similarly, a subexpression can be reused within the main expression, by giving it a name on first use:
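For instance, in the following sketch (f is again a placeholder), y is named at its first use and reused in the second element:

```python
def f(x):
    """Stand-in for an expensive function."""
    return x + 1

# Elements of a list display are evaluated left to right, so y is
# bound before x / y is computed.
stuff = [[y := f(x), x / y] for x in range(5)]
```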

Note that in both cases the variable y is bound in the containing scope (i.e. at the same level as results or stuff ).

Assignment expressions can be used to good effect in the header of an if or while statement:
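A sketch of both headers, using an in-memory stream in place of a real file so that it runs as-is (the text and pattern are illustrative):

```python
import io
import re

# Capture a regular-expression match object in the `if` header.
text = "order 66 shipped"
if match := re.search(r"\d+", text):
    number = int(match.group())
else:
    number = None

# Read fixed-size chunks until read() returns an empty string.
stream = io.StringIO("abcdefghij")
chunks = []
while chunk := stream.read(4):
    chunks.append(chunk)
```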

Particularly with the while loop, this can remove the need to have an infinite loop, an assignment, and a condition. It also creates a smooth parallel between a loop which simply uses a function call as its condition, and one which uses that as its condition but also uses the actual value.

An example from the low-level UNIX world:
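The original snippet is missing from this copy. As one sketch in the same spirit, and not necessarily the example originally shown here, the loop below drains a pipe with os.read() , which returns an empty bytes object at end of file:

```python
import os

read_fd, write_fd = os.pipe()
os.write(write_fd, b"hello, pipe")
os.close(write_fd)  # closing the write end lets the reader see EOF

received = []
while data := os.read(read_fd, 4):  # b"" at EOF is falsy and ends the loop
    received.append(data)
os.close(read_fd)

payload = b"".join(received)
```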

Rejected alternative proposals

Proposals broadly similar to this one have come up frequently on python-ideas. Below are a number of alternative syntaxes, some of them specific to comprehensions, which have been rejected in favour of the one given above.

A previous version of this PEP proposed subtle changes to the scope rules for comprehensions, to make them more usable in class scope and to unify the scope of the “outermost iterable” and the rest of the comprehension. However, this part of the proposal would have caused backwards incompatibilities, and has been withdrawn so the PEP can focus on assignment expressions.

Broadly the same semantics as the current proposal, but spelled differently.

Since EXPR as NAME already has meaning in import , except and with statements (with different semantics), this would create unnecessary confusion or require special-casing (e.g. to forbid assignment within the headers of these statements).

(Note that with EXPR as VAR does not simply assign the value of EXPR to VAR – it calls EXPR.__enter__() and assigns the result of that to VAR .)

Additional reasons to prefer := over this spelling include:

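  • In if f(x) as y the assignment target doesn’t jump out at you – it just reads like if f x blah blah and it is too similar visually to if f(x) and y .

  • In all other situations where an as clause is allowed, even readers with intermediary skills are led to anticipate that clause (however optional) by the keyword that starts the line, and the grammar ties that keyword closely to the as clause:

      • import foo as bar
      • except Exc as var
      • with ctxmgr() as var

    To the contrary, the assignment expression does not belong to the if or while that starts the line, and we intentionally allow assignment expressions in other contexts as well.

  • The parallel cadence between

      • NAME = EXPR
      • if NAME := EXPR

    reinforces the visual recognition of assignment expressions.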

This syntax is inspired by languages such as R and Haskell, and some programmable calculators. (Note that a left-facing arrow y <- f(x) is not possible in Python, as it would be interpreted as less-than and unary minus.) This syntax has a slight advantage over ‘as’ in that it does not conflict with with , except and import , but otherwise is equivalent. But it is entirely unrelated to Python’s other use of -> (function return type annotations), and compared to := (which dates back to Algol-58) it has a much weaker tradition.

This has the advantage that leaked usage can be readily detected, removing some forms of syntactic ambiguity. However, this would be the only place in Python where a variable’s scope is encoded into its name, making refactoring harder.

Execution order is inverted (the indented body is performed first, followed by the “header”). This requires a new keyword, unless an existing keyword is repurposed (most likely with: ). See PEP 3150 for prior discussion on this subject (with the proposed keyword being given: ).

This syntax has fewer conflicts than as does (conflicting only with the raise Exc from Exc notation), but is otherwise comparable to it. Instead of paralleling with expr as target: (which can be useful but can also be confusing), this has no parallels, but is evocative.

One of the most popular use-cases is if and while statements. Instead of a more general solution, this proposal enhances the syntax of these two statements to add a means of capturing the compared value:

This works beautifully if and ONLY if the desired condition is based on the truthiness of the captured value. It is thus effective for specific use-cases (regex matches, socket reads that return '' when done), and completely useless in more complicated cases (e.g. where the condition is f(x) < 0 and you want to capture the value of f(x) ). It also has no benefit to list comprehensions.

Advantages: No syntactic ambiguities. Disadvantages: Answers only a fraction of possible use-cases, even in if / while statements.

Another common use-case is comprehensions (list/set/dict, and genexps). As above, proposals have been made for comprehension-specific solutions.

This brings the subexpression to a location in between the ‘for’ loop and the expression. It introduces an additional language keyword, which creates conflicts. Of the three, where reads the most cleanly, but also has the greatest potential for conflict (e.g. SQLAlchemy and numpy have where methods, as does tkinter.dnd.Icon in the standard library).

As above, but reusing the with keyword. Doesn’t read too badly, and needs no additional language keyword. Is restricted to comprehensions, though, and cannot as easily be transformed into “longhand” for-loop syntax. Has the C problem that an equals sign in an expression can now create a name binding, rather than performing a comparison. Would raise the question of why “with NAME = EXPR:” cannot be used as a statement on its own.

As per option 2, but using as rather than an equals sign. Aligns syntactically with other uses of as for name binding, but a simple transformation to for-loop longhand would create drastically different semantics; the meaning of with inside a comprehension would be completely different from the meaning as a stand-alone statement, while retaining identical syntax.

Regardless of the spelling chosen, this introduces a stark difference between comprehensions and the equivalent unrolled long-hand form of the loop. It is no longer possible to unwrap the loop into statement form without reworking any name bindings. The only keyword that can be repurposed to this task is with , thus giving it sneakily different semantics in a comprehension than in a statement; alternatively, a new keyword is needed, with all the costs therein.

There are two logical precedences for the := operator. Either it should bind as loosely as possible, as does statement-assignment; or it should bind more tightly than comparison operators. Placing its precedence between the comparison and arithmetic operators (to be precise: just lower than bitwise OR) allows most uses inside while and if conditions to be spelled without parentheses, as it is most likely that you wish to capture the value of something, then perform a comparison on it:

Once find() returns -1, the loop terminates. If := binds as loosely as = does, this would capture the result of the comparison (generally either True or False ), which is less useful.

While this behaviour would be convenient in many situations, it is also harder to explain than “the := operator behaves just like the assignment statement”, and as such, the precedence for := has been made as close as possible to that of = (with the exception that it binds tighter than comma).
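Under the precedence that was adopted, the find() loop therefore needs parentheses around the named expression. A sketch with made-up data:

```python
buffer = "spam, eggs, spam, ham, spam"
search_term = "spam"

positions = []
pos = -1
# Parentheses make := capture the index rather than the comparison result.
while (pos := buffer.find(search_term, pos + 1)) >= 0:
    positions.append(pos)

# Without inner parentheses the name is bound to the comparison result.
flag = (found := buffer.find("ham") >= 0)
```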

Some critics have claimed that the assignment expressions should allow unparenthesized tuples on the right, so that these two would be equivalent:

(With the current version of the proposal, the latter would be equivalent to ((point := x), y) .)

However, adopting this stance would logically lead to the conclusion that when used in a function call, assignment expressions also bind less tight than comma, so we’d have the following confusing equivalence:

The less confusing option is to make := bind more tightly than comma.

It’s been proposed to just always require parentheses around an assignment expression. This would resolve many ambiguities, and indeed parentheses will frequently be needed to extract the desired subexpression. But in the following cases the extra parentheses feel redundant:

Frequently Raised Objections

C and its derivatives define the = operator as an expression, rather than a statement as is Python’s way. This allows assignments in more contexts, including contexts where comparisons are more common. The syntactic similarity between if (x == y) and if (x = y) belies their drastically different semantics. Thus this proposal uses := to clarify the distinction.

The two forms have different flexibilities. The := operator can be used inside a larger expression; the = statement can be augmented to += and its friends, can be chained, and can assign to attributes and subscripts.

Previous revisions of this proposal involved sublocal scope (restricted to a single statement), preventing name leakage and namespace pollution. While a definite advantage in a number of situations, this increases complexity in many others, and the costs are not justified by the benefits. In the interests of language simplicity, the name bindings created here are exactly equivalent to any other name bindings, including that usage at class or module scope will create externally-visible names. This is no different from for loops or other constructs, and can be solved the same way: del the name once it is no longer needed, or prefix it with an underscore.

(The author wishes to thank Guido van Rossum and Christoph Groth for their suggestions to move the proposal in this direction. [2] )

As expression assignments can sometimes be used equivalently to statement assignments, the question of which should be preferred will arise. For the benefit of style guides such as PEP 8 , two recommendations are suggested.

  • If either assignment statements or assignment expressions can be used, prefer statements; they are a clear declaration of intent.
  • If using assignment expressions would lead to ambiguity about execution order, restructure it to use statements instead.

The authors wish to thank Alyssa Coghlan and Steven D’Aprano for their considerable contributions to this proposal, and members of the core-mentorship mailing list for assistance with implementation.

Appendix A: Tim Peters’s findings

Here’s a brief essay Tim Peters wrote on the topic.

I dislike “busy” lines of code, and also dislike putting conceptually unrelated logic on a single line. So, for example, instead of:

instead. So I suspected I’d find few places I’d want to use assignment expressions. I didn’t even consider them for lines already stretching halfway across the screen. In other cases, “unrelated” ruled:

is a vast improvement over the briefer:

The original two statements are doing entirely different conceptual things, and slamming them together is conceptually insane.

In other cases, combining related logic made it harder to understand, such as rewriting:

as the briefer:

The while test there is too subtle, crucially relying on strict left-to-right evaluation in a non-short-circuiting or method-chaining context. My brain isn’t wired that way.
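The two snippets being compared are missing from this copy. The following sketch shows the same kind of rewrite on a simpler series (a geometric series rather than whatever the essay originally summed; both function names are hypothetical). The briefer form only works because the left operand of != is evaluated before the assignment on the right:

```python
def sum_series_long(first_term, ratio):
    """Loop-and-a-half form: stop when adding a term changes nothing."""
    total, term = 0.0, first_term
    while True:
        old = total
        total += term
        if old == total:
            return total
        term *= ratio

def sum_series_brief(first_term, ratio):
    """Briefer form: the old total is read before := rebinds it."""
    total, term = 0.0, first_term
    while total != (total := total + term):
        term *= ratio
    return total
```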

But cases like that were rare. Name binding is very frequent, and “sparse is better than dense” does not mean “almost empty is better than sparse”. For example, I have many functions that return None or 0 to communicate “I have nothing useful to return in this case, but since that’s expected often I’m not going to annoy you with an exception”. This is essentially the same as regular expression search functions returning None when there is no match. So there was lots of code of the form:

I find that clearer, and certainly a bit less typing and pattern-matching reading, as:

It’s also nice to trade away a small amount of horizontal whitespace to get another _line_ of surrounding code on screen. I didn’t give much weight to this at first, but it was so very frequent it added up, and I soon enough became annoyed that I couldn’t actually run the briefer code. That surprised me!

There are other cases where assignment expressions really shine. Rather than pick another from my code, Kirill Balunov gave a lovely example from the standard library’s copy() function in copy.py :

The ever-increasing indentation is semantically misleading: the logic is conceptually flat, “the first test that succeeds wins”:

Using easy assignment expressions allows the visual structure of the code to emphasize the conceptual flatness of the logic; ever-increasing indentation obscured it.

A smaller example from my code delighted me, both allowing to put inherently related logic in a single line, and allowing to remove an annoying “artificial” indentation level:

That if is about as long as I want my lines to get, but remains easy to follow.
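The snippet itself is missing here. A sketch of the shape being described, wrapped in a hypothetical function so that it can run on its own:

```python
from math import gcd

def find_factor(x, x_base, n):
    """Return a common factor of (x - x_base) and n greater than 1, if any."""
    # Two related bindings and two tests share one line and one indent level.
    if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
        return g
    return None
```

Because and short-circuits, gcd() is never called when diff is zero.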

So, in all, in most lines binding a name, I wouldn’t use assignment expressions, but because that construct is so very frequent, that leaves many places I would. In most of the latter, I found a small win that adds up due to how often it occurs, and in the rest I found a moderate to major win. I’d certainly use it more often than ternary if , but significantly less often than augmented assignment.

I have another example that quite impressed me at the time.

Where all variables are positive integers, and a is at least as large as the n’th root of x, this algorithm returns the floor of the n’th root of x (and roughly doubling the number of accurate bits per iteration):
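A sketch of that algorithm, with the loop-and-a-half form alongside for comparison (the function names are made up; the body follows the description above):

```python
def nth_root_floor(x, n, a):
    """Floor of the n-th root of x, starting from a guess a >= the root."""
    # "While the current guess is too large, get a smaller guess."
    while a > (d := x // a ** (n - 1)):
        a = ((n - 1) * a + d) // n
    return a

def nth_root_floor_long(x, n, a):
    """The same computation written as a loop and a half."""
    while True:
        d = x // a ** (n - 1)
        if a <= d:
            break
        a = ((n - 1) * a + d) // n
    return a
```

Starting from a = x always satisfies the precondition, since x is at least as large as its own n-th root for positive integers.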

It’s not obvious why that works, but is no more obvious in the “loop and a half” form. It’s hard to prove correctness without building on the right insight (the “arithmetic mean - geometric mean inequality”), and knowing some non-trivial things about how nested floor functions behave. That is, the challenges are in the math, not really in the coding.

If you do know all that, then the assignment-expression form is easily read as “while the current guess is too large, get a smaller guess”, where the “too large?” test and the new guess share an expensive sub-expression.

To my eyes, the original form is harder to understand:

This appendix attempts to clarify (though not specify) the rules when a target occurs in a comprehension or in a generator expression. For a number of illustrative examples we show the original code, containing a comprehension, and the translation, where the comprehension has been replaced by an equivalent generator function plus some scaffolding.

Since [x for ...] is equivalent to list(x for ...) these examples all use list comprehensions without loss of generality. And since these examples are meant to clarify edge cases of the rules, they aren’t trying to look like real code.

Note: comprehensions are already implemented via synthesizing nested generator functions like those in this appendix. The new part is adding appropriate declarations to establish the intended scope of assignment expression targets (the same scope they resolve to as if the assignment were performed in the block containing the outermost comprehension). For type inference purposes, these illustrative expansions do not imply that assignment expression targets are always Optional (but they do indicate the target binding scope).

Let’s start with a reminder of what code is generated for a generator expression without assignment expression.

  • Original code (EXPR usually references VAR):

        def f():
            a = [EXPR for VAR in ITERABLE]

  • Translation (let’s not worry about name conflicts):

        def f():
            def genexpr(iterator):
                for VAR in iterator:
                    yield EXPR
            a = list(genexpr(iter(ITERABLE)))

Let’s add a simple assignment expression.

  • Original code:

        def f():
            a = [TARGET := EXPR for VAR in ITERABLE]

  • Translation:

        def f():
            if False:
                TARGET = None  # Dead code to ensure TARGET is a local variable
            def genexpr(iterator):
                nonlocal TARGET
                for VAR in iterator:
                    TARGET = EXPR
                    yield TARGET
            a = list(genexpr(iter(ITERABLE)))

Let’s add a global TARGET declaration in f() .

  • Original code:

        def f():
            global TARGET
            a = [TARGET := EXPR for VAR in ITERABLE]

  • Translation:

        def f():
            global TARGET
            def genexpr(iterator):
                global TARGET
                for VAR in iterator:
                    TARGET = EXPR
                    yield TARGET
            a = list(genexpr(iter(ITERABLE)))

Or instead let’s add a nonlocal TARGET declaration in f() .

  • Original code:

        def g():
            TARGET = ...
            def f():
                nonlocal TARGET
                a = [TARGET := EXPR for VAR in ITERABLE]

  • Translation:

        def g():
            TARGET = ...
            def f():
                nonlocal TARGET
                def genexpr(iterator):
                    nonlocal TARGET
                    for VAR in iterator:
                        TARGET = EXPR
                        yield TARGET
                a = list(genexpr(iter(ITERABLE)))

Finally, let’s nest two comprehensions.

  • Original code:

        def f():
            a = [[TARGET := i for i in range(3)] for j in range(2)]
            # I.e., a = [[0, 1, 2], [0, 1, 2]]
            print(TARGET)  # prints 2

  • Translation:

        def f():
            if False:
                TARGET = None
            def outer_genexpr(outer_iterator):
                nonlocal TARGET
                def inner_generator(inner_iterator):
                    nonlocal TARGET
                    for i in inner_iterator:
                        TARGET = i
                        yield i
                for j in outer_iterator:
                    yield list(inner_generator(range(3)))
            a = list(outer_genexpr(range(2)))
            print(TARGET)

Because it has been a point of confusion, note that nothing about Python’s scoping semantics is changed. Function-local scopes continue to be resolved at compile time, and to have indefinite temporal extent at run time (“full closures”). Example:
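The original example is missing from this copy. The simplified sketch below illustrates the same point: the target is an ordinary local of the enclosing function, decided at compile time, and it stays unbound until the generator expression actually runs.

```python
def f():
    # `a` is local to f because := inside the generator expression
    # binds in the containing scope, but nothing has been assigned yet.
    gen = ((a := i) for i in range(3))
    try:
        _ = a
        before = "bound"
    except NameError:  # UnboundLocalError is a subclass of NameError
        before = "unbound"
    consumed = list(gen)  # running the generator performs the bindings
    return before, consumed, a

state = f()
```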

This document has been placed in the public domain.

Source: https://github.com/python/peps/blob/main/peps/pep-0572.rst

Last modified: 2023-10-11 12:05:51 GMT

15 Python while Loop Exercises with Solutions for Beginners


While loops are a fundamental control structure in Python (and most programming languages), and they are used whenever you need to execute a block of code repeatedly as long as a certain condition is true. In this article, I will give you 15 Python while loop exercises for beginners which will cover a wide range of programming scenarios.


How do you practice while loop in Python?

The basic syntax or structure of a while loop looks like below:
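The structure itself is missing from this copy. A minimal sketch of the shape (the counter is just a placeholder):

```python
count = 0
while count < 3:          # the condition, checked before every pass
    print("pass", count)  # the statement(s) to repeat
    count += 1            # change something so the condition can become false
```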

In this basic while loop structure, you provide a condition and the statements needed to accomplish your task. In this article, I list some simple while loop practice programs for beginners, with solutions, that will help you practice and gain confidence in Python.

Exercise 1: Print Numbers 1 to 10

This exercise uses a while loop to print numbers from 1 to 10 sequentially.
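One possible solution (a sketch; the list is only there so the result can be checked):

```python
num = 1
printed = []
while num <= 10:
    print(num)
    printed.append(num)
    num += 1
```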

Exercise 2: Calculate the Sum of Numbers 1 to 100

This exercise calculates the sum of numbers from 1 to 100 using a while loop in Python.
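One possible solution, as a sketch:

```python
total = 0
num = 1
while num <= 100:
    total += num
    num += 1
print("Sum of numbers from 1 to 100:", total)
```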

Exercise 3: Count Down from 10 to 1

In contrast to Exercise 1, this exercise counts down from 10 to 1 and prints each number in the countdown.
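One possible solution, as a sketch:

```python
num = 10
countdown = []
while num >= 1:
    print(num)
    countdown.append(num)
    num -= 1
```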

Exercise 4: Find the Factorial of a Number

This Python exercise calculates the factorial of a user-input number using a while loop. If you want more details about this exercise, read this article: 15 Simple Python Programs for Practice with Solutions.

Notice that I used the input() function. It lets you take input from the user at the Python prompt. The value returned by input() is always a string, so you need to convert it to an integer with the int() function.
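A sketch of one solution. To keep it runnable without a prompt, the loop is wrapped in a function (the name factorial is my choice); the commented line shows where input() and int() would come in.

```python
def factorial(number):
    """Multiply number * (number - 1) * ... * 2 using a while loop."""
    result = 1
    while number > 1:
        result *= number
        number -= 1
    return result

# Interactive use: number = int(input("Enter a number: "))
print(factorial(5))
```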

Exercise 5: Check if a Number is Prime

This exercise checks if a user-input number is prime or not by using a while loop and testing divisibility.

In this Python script, notice that I used if-elif-else statements and a break statement inside the while loop. This code can be a helpful exercise for understanding how to use both inside a while loop in Python.
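A sketch of one way to write it with if-elif-else and break inside the loop (not necessarily identical to the original listing; is_prime is a name chosen here):

```python
def is_prime(number):
    """Test divisibility by 2, 3, ... up to the square root of number."""
    if number < 2:
        return False
    divisor = 2
    prime = True
    while True:
        if divisor * divisor > number:
            break            # no divisor found, so the number is prime
        elif number % divisor == 0:
            prime = False    # found a divisor
            break
        else:
            divisor += 1
    return prime
```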

Exercise 6: Print Fibonacci Series

This exercise generates and prints the Fibonacci series up to a specified number of terms using a while loop.
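One possible solution, sketched as a function so the number of terms is a parameter:

```python
def fibonacci(terms):
    """Return the first `terms` Fibonacci numbers, starting from 0."""
    series = []
    a, b = 0, 1
    while len(series) < terms:
        series.append(a)
        a, b = b, a + b
    return series

print(fibonacci(10))
```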

Exercise 7: Reverse a Number

This exercise reverses a user-input number and prints the reversed number.
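A sketch for non-negative integers (the function name is my choice):

```python
def reverse_number(number):
    """Peel off the last digit with % 10 and append it to the result."""
    reversed_number = 0
    while number > 0:
        reversed_number = reversed_number * 10 + number % 10
        number //= 10
    return reversed_number

print(reverse_number(1234))
```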

Exercise 8: Calculate the Average of Numbers

This while loop exercise calculates the average of a set of user-input numbers in Python.
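A sketch that takes the numbers as a list instead of prompting for them, which is an assumption made to keep it self-contained:

```python
def average(numbers):
    """Add up the numbers with a while loop, then divide by the count."""
    total = 0
    index = 0
    while index < len(numbers):
        total += numbers[index]
        index += 1
    return total / len(numbers) if numbers else 0.0

print(average([2, 4, 6]))
```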

Exercise 9: Reverse a String

This exercise reverses a user-input string and prints the reversed string.
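One possible solution, as a sketch:

```python
def reverse_string(text):
    """Walk from the last character to the first, building a new string."""
    reversed_text = ""
    index = len(text) - 1
    while index >= 0:
        reversed_text += text[index]
        index -= 1
    return reversed_text

print(reverse_string("Python"))
```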

Exercise 10: Print Even Numbers

This exercise prints the even numbers from 2 to 20 using a while loop.
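One possible solution, as a sketch:

```python
num = 2
evens = []
while num <= 20:
    print(num)
    evens.append(num)
    num += 2  # step straight to the next even number
```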

Exercise 11: Calculate the Sum of Digits in a Number

This exercise calculates the sum of the digits in a user-input number using a while loop.
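A sketch for non-negative integers:

```python
def digit_sum(number):
    """Add the last digit to the total, then drop it, until none remain."""
    total = 0
    while number > 0:
        total += number % 10
        number //= 10
    return total

print(digit_sum(1234))
```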

Exercise 12: Print a Triangle of Stars

This exercise prints a triangle pattern of stars with a user-specified number of rows using a while loop.
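A sketch that takes the number of rows as a parameter rather than prompting for it:

```python
def triangle(rows):
    """Print one more star on each row and return the printed lines."""
    lines = []
    row = 1
    while row <= rows:
        line = "*" * row
        print(line)
        lines.append(line)
        row += 1
    return lines

triangle_lines = triangle(4)
```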

Exercise 13: Break and Continue statement in while loops

Break statement:

The break statement is used to exit a loop before its normal termination condition is met. It is often used to terminate a loop when a specific condition is satisfied.

In this code, we have a while loop that runs indefinitely ( while True ). Inside the loop, we continuously ask the user for input using the input() function. We check if the user input is equal to ‘q’. If it is, we use the break statement to exit the loop immediately. If the user doesn’t enter ‘q’, we print the input.
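A sketch of that loop. So that it can run without a keyboard, the function takes the reading step as a parameter (my assumption); interactively you would pass lambda: input("Enter something (q to quit): ") .

```python
def echo_until_quit(read_line):
    """Echo each input until the user enters 'q'."""
    echoed = []
    while True:
        user_input = read_line()
        if user_input == "q":
            break  # leave the loop immediately
        print(user_input)
        echoed.append(user_input)
    return echoed

# Scripted answers stand in for a user typing at the prompt.
scripted = iter(["hello", "world", "q", "never read"])
echoed = echo_until_quit(lambda: next(scripted))
```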

Continue Statement:

The continue statement is used to skip the current iteration of a loop and move to the next iteration. It’s handy when you want to skip some specific values or conditions within a loop.

In this code, we have a while loop that iterates from 1 to 10 ( while num <= 10 ). Inside the loop, we check if the current value of num is even (using num % 2 == 0 ).

If num is even, we use the continue statement to skip the rest of the loop body and move to the next iteration. If num is odd, we print its value and increment num .
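A sketch of that loop. Note the increment before continue : without it, num would stay even forever and the loop would never end.

```python
num = 1
odd_numbers = []
while num <= 10:
    if num % 2 == 0:
        num += 1   # advance first, then skip the rest of the body
        continue
    print(num)
    odd_numbers.append(num)
    num += 1
```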

Exercise 14: Adding elements to a list using while loop

You can add elements to a list using a while loop in Python. In the code below, we append random numbers to an empty list. The loop stops when the length of the list reaches 5 ( num_elements ).
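A sketch of this exercise (the range 1 to 100 for the random numbers is my assumption):

```python
import random

num_elements = 5
my_list = []
while len(my_list) < num_elements:
    my_list.append(random.randint(1, 100))  # both endpoints are included
print(my_list)
```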

Exercise 15: Removing elements from a list using while loop

In contrast to the previous exercise, this one removes elements from a list using a while loop. List elements greater than the threshold value (20) are removed from my_list .
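One way to sketch this exercise. The starting values in `my_list` are made up for illustration. The index only advances when nothing was removed, because removing an element shifts the next one into the current position.

```python
my_list = [5, 25, 12, 40, 18, 33, 20]  # sample data (assumed)
threshold = 20

index = 0
while index < len(my_list):
    if my_list[index] > threshold:
        my_list.pop(index)  # do not advance: the next element moved into this slot
    else:
        index += 1

print(my_list)  # [5, 12, 18, 20]
```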

In this article, I listed some example Python exercises for practising the while loop. These Python while loop exercises for beginners cover a wide range of programming scenarios, which will give you confidence when writing code in Python.

This is it for this tutorial. If you have any comments or suggestions regarding this article, please drop a comment below. If you want to learn Python quickly, then this Udemy course is for you: Learn Python in 100 days of coding . If you love learning from books, then this article is for you: 5 Best Book for Learning Python .

Anindya Naskar

Hi there, I’m Anindya Naskar, Data Science Engineer. I created this website to show you what I believe is the best possible way to get your start in the field of Data Science.


Assignment Operators in Python

Operators are used to perform operations on values and variables. They are special symbols that carry out arithmetic, logical, and bitwise computations. The value an operator works on is known as an operand.

Here, we will cover Assignment Operators in Python. Assignment Operators are used to assign values to variables.

Now let's look at each Assignment Operator one by one.

1) Assign (=): This operator assigns the value of the expression on the right side to the operand on the left side.

2) Add and Assign (+=): This operator adds the right operand to the left operand and then assigns the result to the left operand.

Syntax: x += y (for numbers, equivalent to x = x + y)

3) Subtract and Assign (-=): This operator subtracts the right operand from the left operand and then assigns the result to the left operand.

Example – x -= y (for numbers, equivalent to x = x - y)

4) Multiply and Assign (*=): This operator multiplies the left operand by the right operand and then assigns the result to the left operand.

5) Divide and Assign (/=): This operator divides the left operand by the right operand and then assigns the result to the left operand.

6) Modulus and Assign (%=): This operator takes the remainder of dividing the left operand by the right operand and then assigns the result to the left operand.

7) Divide (floor) and Assign (//=): This operator divides the left operand by the right operand and then assigns the floor of the result to the left operand.

8) Exponent and Assign (**=): This operator raises the left operand to the power of the right operand and then assigns the result to the left operand.

9) Bitwise AND and Assign (&=): This operator performs a bitwise AND on both operands and then assigns the result to the left operand.

10) Bitwise OR and Assign (|=): This operator performs a bitwise OR on the operands and then assigns the result to the left operand.

11) Bitwise XOR and Assign (^=): This operator performs a bitwise XOR on the operands and then assigns the result to the left operand.

12) Bitwise Right Shift and Assign (>>=): This operator shifts the bits of the left operand to the right by the number of positions given by the right operand and then assigns the result to the left operand.

13) Bitwise Left Shift and Assign (<<=): This operator shifts the bits of the left operand to the left by the number of positions given by the right operand and then assigns the result to the left operand.
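The operators listed above can be tried out in one short script. This is only an illustrative sketch; the variable names and starting values are arbitrary.

```python
b = 3

c = 7      # assign
c += b     # add and assign          -> 10
c -= b     # subtract and assign     -> 7
c *= b     # multiply and assign     -> 21
c /= b     # divide and assign       -> 7.0 (true division gives a float)

d = 7
d %= b     # modulus and assign      -> 1

e = 7
e //= b    # floor divide and assign -> 2

f = 2
f **= b    # exponent and assign     -> 8

g = 0b1100   # 12
g &= 0b1010  # bitwise AND and assign  -> 8
g |= 0b0011  # bitwise OR and assign   -> 11
g ^= 0b0110  # bitwise XOR and assign  -> 13
g >>= 2      # right shift and assign  -> 3
g <<= 3      # left shift and assign   -> 24

print(c, d, e, f, g)
```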


  • Open access
  • Published: 15 April 2024

Demuxafy : improvement in droplet assignment by integrating multiple single-cell demultiplexing and doublet detection methods

  • Drew Neavin 1 ,
  • Anne Senabouth 1 ,
  • Himanshi Arora 1 , 2 ,
  • Jimmy Tsz Hang Lee 3 ,
  • Aida Ripoll-Cladellas 4 ,
  • sc-eQTLGen Consortium ,
  • Lude Franke 5 ,
  • Shyam Prabhakar 6 , 7 , 8 ,
  • Chun Jimmie Ye 9 , 10 , 11 , 12 ,
  • Davis J. McCarthy 13 , 14 ,
  • Marta Melé 4 ,
  • Martin Hemberg 15 &
  • Joseph E. Powell   ORCID: orcid.org/0000-0002-5070-4124 1 , 16  

Genome Biology volume 25, Article number: 94 (2024)


Recent innovations in single-cell RNA-sequencing (scRNA-seq) provide the technology to investigate biological questions at cellular resolution. Pooling cells from multiple individuals has become a common strategy, and droplets can subsequently be assigned to a specific individual by leveraging their inherent genetic differences. An implicit challenge with scRNA-seq is the occurrence of doublets—droplets containing two or more cells. We develop Demuxafy, a framework to enhance donor assignment and doublet removal through the consensus intersection of multiple demultiplexing and doublet detecting methods. Demuxafy significantly improves droplet assignment by separating singlets from doublets and classifying the correct individual.

Droplet-based single-cell RNA sequencing (scRNA-seq) technologies have provided the tools to profile tens of thousands of single-cell transcriptomes simultaneously [ 1 ]. With these technological advances, combining cells from multiple samples in a single capture is common, increasing the sample size while simultaneously reducing batch effects, cost, and time. In addition, following cell capture and sequencing, the droplets can be demultiplexed—each droplet accurately assigned to each individual in the pool [ 2 , 3 , 4 , 5 , 6 , 7 ].

Many scRNA-seq experiments now capture upwards of 20,000 droplets, resulting in ~16% (3,200) doublets [ 8 ]. Current demultiplexing methods can also identify doublets—droplets containing two or more cells—from different individuals (heterogenic doublets). These doublets can significantly alter scientific conclusions if they are not effectively removed. Therefore, it is essential to remove doublets from droplet-based single-cell captures.

However, demultiplexing methods cannot identify droplets containing multiple cells from the same individual (homogenic doublets) and, therefore, cannot identify all doublets in a single capture. If left in the dataset, those doublets could appear as transitional cells between two distinct cell types or a completely new cell type. Accordingly, additional methods have been developed to identify heterotypic doublets (droplets that contain two cells from different cell types) by comparing the transcriptional profile of each droplet to doublets simulated from the dataset [ 9 , 10 , 11 , 12 , 13 , 14 , 15 ]. It is important to recognise that demultiplexing methods achieve two functions—segregation of cells from different donors and separation of singlets from doublets—while doublet detecting methods solely classify singlets versus doublets.

Therefore, demultiplexing and transcription-based doublet detecting methods provide complementary information to improve doublet detection, providing a cleaner dataset and more robust scientific results. There are currently seven genetic-based demultiplexing [ 2 , 3 , 4 , 5 , 6 , 7 , 16 ] and seven transcription-based doublet-detecting methods implemented in various languages [ 9 , 10 , 11 , 12 , 13 , 14 , 15 ]. Under different scenarios, each method is subject to varying performance and, in some instances, biases in its ability to accurately assign cells or detect doublets from certain conditions. The best combination of methods is currently unclear but will undoubtedly depend on the dataset and research question.

Therefore, we set out to identify the best combination of genetic-based demultiplexing and transcription-based doublet-detecting methods to remove doublets and partition singlets from different donors correctly. In addition, we have developed a software platform ( Demuxafy ) that performs these intersectional methods and provides additional commands to simplify the execution and interpretation of results for each method (Fig. 1 a).

figure 1

Study design and qualitative method classifications. a  Demuxafy is a platform to perform demultiplexing and doublet detecting with consistent documentation. Demuxafy also provides wrapper scripts to quickly summarize the results from each method and assign clusters to each individual with reference genotypes when a reference-free demultiplexing method is used. Finally, Demuxafy provides a script to easily combine the results from multiple different methods into a single data frame and it provides a final assignment for each droplet based on the combination of multiple methods. In addition, Demuxafy provides summaries of the number of droplets classified as singlets or doublets by each method and a summary of the number of droplets assigned to each individual by each of the demultiplexing methods. b  Two datasets are included in this analysis - a PBMC dataset and a fibroblast dataset. The PBMC dataset contains 74 pools that captured approximately 20,000 droplets each with 12-16 donor cells multiplexed per pool. The fibroblast dataset contains 11 pools of roughly 7,000 droplets per pool with sizes ranging from six to eight donors per pool. All pools were processed by all demultiplexing and doublet detecting methods and the droplet and donor classifications were compared between the methods and between the PBMCs and fibroblasts. Then the PBMC droplets that were classified as singlets by all methods were taken as ‘true singlets’ and used to generate new pools in silico. Those pools were then processed by each of the demultiplexing and doublet detecting methods and intersectional combinations of demultiplexing and doublet detecting methods were tested for different experimental designs

To compare the demultiplexing and doublet detecting methods, we utilised two large, multiplexed datasets—one that contained ~1.4 million peripheral blood mononuclear cells (PBMCs) from 1,034 donors [ 17 ] and one with ~94,000 fibroblasts from 81 donors [ 18 ]. We used the true singlets from the PBMC dataset to generate new in silico pools to assess the performance of each method and the multi-method intersectional combinations (Fig. 1 b).

Here, we compare 14 demultiplexing and doublet detecting methods with different methodological approaches, capabilities, and intersectional combinations. Seven of those are demultiplexing methods ( Demuxalot [ 6 ], Demuxlet [ 3 ], Dropulation [ 5 ], Freemuxlet [ 16 ], ScSplit [ 7 ], Souporcell [ 4 ], and Vireo [ 2 ]) which leverage the common genetic variation between individuals to identify cells that came from each individual and to identify heterogenic doublets. The seven remaining methods ( DoubletDecon [ 9 ], DoubletDetection [ 14 ], DoubletFinder [ 10 ], ScDblFinder [ 11 ], Scds [ 12 ], Scrublet [ 13 ], and Solo [ 15 ]) identify doublets based on their similarity to simulated doublets generated by adding the transcriptional profiles of two randomly selected droplets in the dataset. These methods assume that the proportion of real doublets in the dataset is low, so combining any two droplets will likely represent the combination of two singlets.

We identify critical differences in the performance of demultiplexing and doublet detecting methods to classify droplets correctly. In the case of the demultiplexing techniques, their performance depends on their ability to identify singlets from doublets and assign a singlet to the correct individual. For doublet detecting methods, the performance is based solely on their ability to differentiate a singlet from a doublet. We identify limitations in identifying specific doublet types and cell types by some methods. In addition, we compare the intersectional combinations of these methods for multiple experimental designs and demonstrate that intersectional approaches significantly outperform all individual techniques. Thus, the intersectional methods provide enhanced singlet classification and doublet removal—a critical but often under-valued step of droplet-based scRNA-seq processing. Our results demonstrate that intersectional combinations of demultiplexing and doublet detecting software provide significant advantages in droplet-based scRNA-seq preprocessing that can alter results and conclusions drawn from the data. Finally, to provide easy implementation of our intersectional approach, we provide Demuxafy ( https://demultiplexing-doublet-detecting-docs.readthedocs.io/en/latest/index.html ) a complete platform to perform demultiplexing and doublet detecting intersectional methods (Fig. 1 a).

Study design

To evaluate demultiplexing and doublet detecting methods, we developed an experimental design that applies the different techniques to empirical pools and to pools generated in silico from the combination of true singlets, that is, droplets identified as singlets by every method (Fig. 1b). For the first phase of this study, we used two empirical multiplexed datasets: the peripheral blood mononuclear cell (PBMC) dataset containing ~1.4 million cells from 1,034 donors and a fibroblast dataset of ~94,000 cells from 81 individuals (Additional file 1: Table S1). We chose these two cell systems to assess the methods in heterogeneous (PBMC) and homogeneous (fibroblast) cell types.
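The 'true singlet' definition used here (a droplet called a singlet by every method) can be illustrated with a small sketch. This is not Demuxafy's code; the function name `true_singlets`, the dictionary layout, and the example barcodes and calls are all hypothetical.

```python
def true_singlets(calls_by_method):
    """Return barcodes that every method classified as a singlet.

    calls_by_method maps a method name to a dict of barcode -> "singlet"/"doublet".
    Barcodes missing from any method are excluded.
    """
    methods = list(calls_by_method.values())
    shared = set(methods[0])
    for calls in methods[1:]:
        shared &= set(calls)
    return {
        barcode
        for barcode in shared
        if all(calls[barcode] == "singlet" for calls in methods)
    }


# Made-up calls for three droplets from three methods
example_calls = {
    "Vireo": {"AAAC": "singlet", "AAAG": "singlet", "AAAT": "doublet"},
    "Souporcell": {"AAAC": "singlet", "AAAG": "doublet", "AAAT": "doublet"},
    "Scrublet": {"AAAC": "singlet", "AAAG": "singlet", "AAAT": "singlet"},
}
print(true_singlets(example_calls))  # {'AAAC'}
```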

Demultiplexing and doublet detecting methods perform similarly for heterogeneous and homogeneous cell types

We applied the demultiplexing methods (Demuxalot, Demuxlet, Dropulation, Freemuxlet, ScSplit, Souporcell, and Vireo) and doublet detecting methods (DoubletDecon, DoubletDetection, DoubletFinder, ScDblFinder, Scds, Scrublet, and Solo) to the two datasets and assessed the results from each method. We first compared the droplet assignments by identifying the number of singlets and doublets identified by a given method that were consistently annotated by all methods (Fig. 2a–d). We also identified the percentage of droplets that were annotated consistently between pairs of methods (Additional file 2: Fig S1). In the cases where two demultiplexing methods were compared to one another, both the droplet type (singlet or doublet) and the assignment of the droplet to an individual had to match to be considered in agreement. In all other comparisons (i.e. demultiplexing versus doublet detecting and doublet detecting versus doublet detecting), only the droplet type (singlet or doublet) was considered for agreement since doublet detecting methods cannot annotate donor assignment. We found that methods were more similar to other methods of the same type (i.e., demultiplexing versus demultiplexing and doublet detecting versus doublet detecting) than they were to methods of a different type (demultiplexing methods versus doublet detecting methods; Additional file 2: Fig S1). We found that the similarity of the demultiplexing and doublet detecting methods was consistent in the PBMC and fibroblast datasets (Pearson correlation R = 0.78, P-value < 2 × 10⁻¹⁶; Additional file 2: Fig S1a–c). In addition, demultiplexing methods were more similar to one another than doublet detecting methods were for both the PBMC and fibroblast datasets (Wilcoxon rank-sum test: P < 0.01; Fig. 2a–b and Additional file 2: Fig S1).
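The pairwise agreement rule described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name `percent_agreement`, the `(droplet_type, donor)` tuple layout, and the example calls are hypothetical.

```python
def percent_agreement(calls_a, calls_b, compare_donor):
    """Percentage of shared droplets on which two methods agree.

    Each calls dict maps barcode -> (droplet_type, donor). Set compare_donor
    to True only when both methods are demultiplexing methods, so that the
    donor assignment must match as well as the droplet type.
    """
    shared = calls_a.keys() & calls_b.keys()
    if not shared:
        return 0.0
    matches = 0
    for barcode in shared:
        type_a, donor_a = calls_a[barcode]
        type_b, donor_b = calls_b[barcode]
        agree = type_a == type_b
        if agree and compare_donor:
            agree = donor_a == donor_b
        matches += agree
    return 100.0 * matches / len(shared)


# Made-up calls from two demultiplexing methods
demux_1 = {"AAAC": ("singlet", "donor1"), "AAAG": ("singlet", "donor2"),
           "AAAT": ("doublet", None), "AACA": ("singlet", "donor1")}
demux_2 = {"AAAC": ("singlet", "donor1"), "AAAG": ("singlet", "donor1"),
           "AAAT": ("doublet", None), "AACA": ("singlet", "donor1")}

print(percent_agreement(demux_1, demux_2, compare_donor=True))   # 75.0
print(percent_agreement(demux_1, demux_2, compare_donor=False))  # 100.0
```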

figure 2

Demultiplexing and Doublet Detecting Method Performance Comparison. a The proportion of droplets classified as singlets and doublets by each method in the PBMCs. b The number of other methods that classified the singlets and doublets identified by each method in the PBMCs. c The proportion of droplets classified as singlets and doublets by each method in the fibroblasts. d The number of other methods that classified the singlets and doublets identified by each method in the fibroblasts. e-f The performance of each method when the majority classification of each droplet is considered the correct annotation in the PBMCs ( e ) and fibroblasts ( f ). g-h The number of droplets classified as singlets (box plots) and doublets (bar plots) by all methods in the PBMC ( g ) and fibroblast ( h ) pools. i-j The number of donors that were not identified by each method in each pool for PBMCs ( i ) and fibroblasts ( j ). PBMC: peripheral blood mononuclear cell. MCC: Matthews correlation coefficient

The number of unique molecular identifiers (UMIs) and genes decreased in droplets that were classified as singlets by a larger number of methods while the mitochondrial percentage increased in both PBMCs and fibroblasts (Additional file 2 : Fig S2).

We next interrogated the performance of each method using the Matthews correlation coefficient (MCC) to calculate the consistency between Demuxafy and true droplet classification. We identified consistent trends in the MCC scores for each method between the PBMCs (Fig. 2e) and fibroblasts (Fig. 2f). These data indicate that the methods behave similarly, relative to one another, for heterogeneous and homogeneous datasets.
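For the empirical pools, the Fig. 2 caption states that the majority classification of each droplet is treated as the correct annotation. A minimal sketch of such a majority vote is shown below; the function name `majority_call`, the "unassigned" label, and the tie handling are assumptions, not a description of how Demuxafy resolves ties.

```python
from collections import Counter


def majority_call(votes):
    """Return the most common classification, or "unassigned" on a tie."""
    counts = Counter(votes).most_common()
    top_label, top_count = counts[0]
    # A tie for first place has no majority (assumed handling)
    if len(counts) > 1 and counts[1][1] == top_count:
        return "unassigned"
    return top_label


print(majority_call(["singlet", "singlet", "doublet"]))  # singlet
print(majority_call(["singlet", "doublet"]))             # unassigned
```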

Next, we sought to identify the droplets concordantly classified by all demultiplexing and doublet detecting methods in the PBMC and fibroblast datasets. On average, 732 singlets were identified for each individual by all the methods in the PBMC dataset. Likewise, 494 droplets were identified as singlets for each individual by all the methods in the fibroblast pools. However, the concordance of doublets identified by all methods was very low for both datasets (Fig. 2 e–f). Notably, the consistency of classifying a droplet as a doublet by all methods was relatively low (Fig. 2 b,d,g, and h). This suggests that doublet identification is not consistent between all the methods. Therefore, further investigation is required to identify the reasons for these inconsistencies between methods. It also suggests that combining multiple methods for doublet classification may be necessary for more complete doublet removal. Further, some methods could not identify all the individuals in each pool (Fig. 2 i–j). The non-concordance between different methods demonstrates the need to effectively test each method on a dataset where the droplet types are known.

Computational resources vary for demultiplexing and doublet detecting methods

We recorded each method’s computational resources for the PBMC pools, with ~20,000 cells captured per pool (Additional file 1: Table S1). Of the demultiplexing methods, ScSplit took the most time (multiple days) and required the most steps, but Demuxalot, Demuxlet, and Freemuxlet used the most memory. Of the doublet detecting methods, Solo took the longest time (median 13 h) and the most memory to run, but it is the only method built to be run directly from the command line, making it easy to implement (Additional file 2: Fig S3).

Generate pools with known singlets and doublets

However, there is no gold standard to identify which droplets are singlets or doublets. Therefore, in the second phase of our experimental design (Fig. 1 b), we used the PBMC droplets classified as singlets by all methods to generate new pools in silico. We chose to use the PBMC dataset since our first analyses indicated that method performance is similar for homogeneous (fibroblast) and heterogeneous (PBMC) cell types (Fig. 2 and Additional file 2 : Fig S1) and because we had many more individuals available to generate in silico pools from the PBMC dataset (Additional file 1 : Table S1).

We generated 70 pools—10 each of pools that included 2, 4, 8, 16, 32, 64, or 128 individuals (Additional file 1 : Table S2). We assume a maximum 20% doublet rate as it is unlikely researchers would use a technology that has a higher doublet rate (Fig. 3 a).

figure 3

In silico Pool Doublet Annotation and Method Performance. a The percent of singlets and doublets in the in silico pools, separated by the number of multiplexed individuals per pool. b The percentage and number of doublets that are heterogenic (detectable by demultiplexing methods), heterotypic (detectable by doublet detecting methods), both (detectable by either method category) and neither (not detectable with current methods) for each multiplexed pool size. c Percent of droplets that each of the demultiplexing and doublet detecting methods classified correctly for singlets and doublet subtypes for different multiplexed pool sizes. d Matthews correlation coefficient (MCC) for each of the methods for each of the multiplexed pool sizes. e Balanced accuracy for each of the methods for each of the multiplexed pool sizes

We used Azimuth to classify the PBMC cell types for each droplet used to generate the in silico pools [ 19 ] (Additional file 2: Fig S4). As these pools have been generated in silico using empirical singlets that have been well annotated, we next identified the proportion of doublets in each pool that were heterogenic, heterotypic, both, and neither. This approach demonstrates that a significant percentage of doublets are only detectable by doublet detecting methods (homogenic and heterotypic) for pools with 16 or fewer donors multiplexed (Fig. 3b).

While the total number of doublets that would be missed if only using demultiplexing methods appears small for fewer multiplexed individuals (Fig. 3 b), it is important to recognise that this is partly a function of the ~732 singlet cells per individual used to generate these pools. Hence, the in silico pools with fewer individuals also have fewer cells. Therefore, to obtain numbers of doublets that are directly comparable to one another, we calculated the number of each doublet type that would be expected to be captured with 20,000 cells when 2, 4, 8, 16, or 32 individuals were multiplexed (Additional file 2 : Fig S5). These results demonstrate that many doublets would be falsely classified as singlets since they are homogenic when just using demultiplexing methods for a pool of 20,000 cells captured with a 16% doublet rate (Additional file 2 : Fig S5). However, as more individuals are multiplexed, the number of droplets that would not be detectable by demultiplexing methods (homogenic) decreases. This suggests that typical workflows that use only one demultiplexing method to remove doublets from pools that capture 20,000 droplets with 16 or fewer multiplexed individuals fail to adequately remove between 173 (16 multiplexed individuals) and 1,325 (2 multiplexed individuals) doublets that are homogenic and heterotypic which could be detected by doublet detecting methods (Additional file 2 : Fig S5). Therefore, a technique that uses both demultiplexing and doublet detecting methods in parallel will complement more complete doublet removal methods. Consequently, we next set out to identify the demultiplexing and doublet detecting methods that perform the best on their own and in concert with other methods.
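The trend described above (fewer homogenic doublets as more individuals are multiplexed) follows from simple arithmetic. The sketch below is a back-of-the-envelope calculation, not the authors' method: it assumes every donor contributes equally and that cells pair at random, so a doublet has a 1 in n chance of containing two cells from the same donor. The function name `expected_homogenic_doublets` is hypothetical.

```python
def expected_homogenic_doublets(n_droplets, doublet_rate, n_donors):
    """Expected doublets whose two cells come from the same donor.

    Assumes equal donor proportions and random pairing (simplifying assumptions).
    """
    n_doublets = round(n_droplets * doublet_rate)
    return n_doublets / n_donors


for donors in (2, 4, 8, 16, 32):
    expected = expected_homogenic_doublets(20_000, 0.16, donors)
    print(donors, expected)
```

Under these assumptions, 20,000 droplets at a 16% doublet rate give 3,200 doublets, of which about 1,600 are homogenic with 2 donors and about 200 with 16 donors. The 1,325 and 173 reported in the text count only the subset that is both homogenic and heterotypic, so they sit below these figures.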

Doublet and singlet droplet classification effectiveness varies for demultiplexing and doublet detecting methods

Demultiplexing methods fail to classify homogenic doublets.

We next investigated the percentage of the droplets that were correctly classified by each demultiplexing and doublet detecting method. In addition to the seven demultiplexing methods, we also included Demuxalot with the additional steps to refine the genotypes that can then be used for demultiplexing— Demuxalot (refined). Demultiplexing methods correctly classify a large portion of the singlets and heterogenic doublets (Fig. 3 c). This pattern is highly consistent across different cell types, with the notable exceptions being decreased correct classifications for erythrocytes and platelets when greater than 16 individuals are multiplexed (Additional file 2 : Fig S6).

However, Demuxalot consistently demonstrates the highest correct heterogenic doublet classification. Further, the percentage of the heterogenic doublets classified correctly by Souporcell decreases when large numbers of donors are multiplexed. ScSplit is not as effective as the other demultiplexing methods at classifying heterogenic doublets, partly due to the unique doublet classification method, which assumes that the doublets will generate a single cluster separate from the donors (Table 1 ). Importantly, the demultiplexing methods identify almost none of the homogenic doublets for any multiplexed pool size—demonstrating the need to include doublet detecting methods to supplement the demultiplexing method doublet detection.

Doublet detecting method classification performances vary greatly

In addition to assessing each of the methods with default settings, we also evaluated ScDblFinder with ‘known doublets’ provided. This method can take already known doublets and use them when detecting doublets. For these cases, we used the droplets that were classified as doublets by all the demultiplexing methods as ‘known doublets’.

Most of the methods classified a similarly high percentage of singlets correctly, with the exceptions of DoubletDecon and DoubletFinder for all pool sizes (Fig. 3c). However, unlike the demultiplexing methods, there are explicit cell-type-specific biases for many of the doublet detecting methods (Additional file 2: Fig S7). These differences are most notable for cell types with fewer cells (i.e. ASDC and cDC2) and proliferating cells (i.e. CD4 Proliferating, CD8 Proliferating, and NK Proliferating). Further, all of the software tools demonstrate high correct percentages for some cell types, including CD4 Naïve and CD8 Naïve (Additional file 2: Fig S7).

As expected, all doublet detecting methods identified heterotypic doublets more effectively than homotypic doublets (Fig. 3c). However, ScDblFinder and Scrublet classified the most doublets correctly across all doublet types for pools containing 16 individuals or fewer. Solo was more effective at identifying doublets than Scds for pools containing more than 16 individuals. It is also important to note that it was not feasible to run DoubletDecon for the largest pools containing 128 multiplexed individuals and an average of 115,802 droplets (range: 113,594–119,126 droplets). ScDblFinder performed similarly when executed with and without known doublets (Pearson correlation P = 2.5 × 10⁻⁴⁰). This suggests that providing known doublets to ScDblFinder does not offer an added benefit.

Performances vary between demultiplexing and doublet detecting method and across the number of multiplexed individuals

We assessed the overall performance of each method with two metrics: the balanced accuracy and the MCC. We chose to use balanced accuracy since, with unbalanced group sizes, it is a better measure of performance than accuracy itself. Further, the MCC has been demonstrated as a more reliable statistical measure of performance since it considers all possible categories—true singlets (true positives), false singlets (false positives), true doublets (true negatives), and false doublets (false negatives). Therefore, a high score on the MCC scale indicates high performance in each metric. However, we provide additional performance metrics for each method (Additional file 1 : Table S3). For demultiplexing methods, both the droplet type (singlet or doublet) and the individual assignment were required to be considered a ‘true singlet’. In contrast, only the droplet type (singlet or doublet) was needed for doublet detection methods.
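Both metrics can be computed from the four confusion-matrix counts, with singlets treated as the positive class as in the text. The sketch below uses the standard textbook formulas; the function names and the example counts are made up for illustration and are not taken from the study.

```python
from math import sqrt


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when any marginal total is zero
    return (tp * tn - fp * fn) / denominator


def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity (singlets) and specificity (doublets)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical counts: 100 real singlets and 20 real doublets
tp, fp, tn, fn = 90, 5, 15, 10
print(mcc(tp, fp, tn, fn))
print(balanced_accuracy(tp, fp, tn, fn))
```

In this made-up example, plain accuracy would be 105/120 = 0.875, while balanced accuracy is lower (about 0.825) because a quarter of the much smaller doublet class is misclassified. This is the kind of imbalance the text refers to when preferring balanced accuracy.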

The MCC and balanced accuracy metrics are similar (Spearman’s ρ = 0.87; P < 2.2 × 10⁻³⁰⁸). Further, the performance of Souporcell decreases for pools with more than 32 individuals multiplexed for both metrics (Student’s t-test for MCC: P < 1.1 × 10⁻⁹ and balanced accuracy: P < 8.1 × 10⁻¹¹). Scds, ScDblFinder, and Scrublet are among the top-performing doublet detecting methods (Fig. 3d–e).

Overall, between 0.4 and 78.8% of droplets were incorrectly classified by the demultiplexing or doublet detecting methods depending on the technique and the multiplexed pool size (Additional file 2: Fig S8). Demuxalot (refined) and DoubletDetection demonstrated the lowest percentage of incorrect droplets, with about 1% wrong in the smaller pools (two multiplexed individuals) and about 3% incorrect in pools with at least 16 multiplexed individuals. Since some transitional states and cell types are present in low percentages in total cell populations (e.g. ASDCs at 0.02%), incorrect classification of droplets could alter scientific interpretations of the data, and it is therefore ideal to decrease the number of erroneous assignments as much as possible.

False singlets and doublets demonstrate different metrics than correctly classified droplets

We next asked whether specific cell metrics might contribute to false singlet and doublet classifications for different methods. Therefore, we compared the number of genes, number of UMIs, mitochondrial percentage and ribosomal percentage of the false singlets and doublets to equal numbers of correctly classified cells for each demultiplexing and doublet detecting method.

The number of UMIs (Additional file 2: Fig S9 and Additional file 1: Table S4) and genes (Additional file 2: Fig S10 and Additional file 1: Table S5) demonstrated very similar distributions for all comparisons and all methods (Spearman ρ = 0.99, P < 2.2 × 10⁻³⁰⁸). The number of UMIs and genes were consistently higher in false singlets and lower in false doublets for most demultiplexing methods except some smaller pool sizes (Additional file 2: Fig S9a and Fig S10a; Additional file 1: Table S4 and Table S5). The number of UMIs and genes was consistently higher in droplets falsely classified as singlets by the doublet detecting methods than in the correctly identified droplets (Additional file 2: Fig S9b and Fig S10b; Additional file 1: Table S4 and Table S5). However, there was less consistency in the number of UMIs and genes detected in false singlets than correctly classified droplets between the different doublet detecting methods (Additional file 2: Fig S9b and Fig S10b; Additional file 1: Table S4 and Table S5).

The ribosomal percentage of the droplets falsely classified as singlets or doublets is similar to the correctly classified droplets for most methods, although they are statistically different for larger pool sizes (Additional file 2 : Fig S11a and Additional file 1 : Table S6). However, the false doublets classified by some demultiplexing methods ( Demuxalot , Demuxalot (refined), Demuxlet , ScSplit , Souporcell , and Vireo ) demonstrated higher ribosomal percentages. Some doublet detecting methods ( ScDblFinder , ScDblFinder with known doublets and Solo) demonstrated higher ribosomal percentages for the false doublets while others demonstrated lower ribosomal percentages ( DoubletDecon , DoubletDetection , and DoubletFinder ; Additional file 2 : Fig S11b and Additional file 1 : Table S6).

Like the ribosomal percentage, the mitochondrial percentage in false singlets is also relatively similar to correctly classified droplets for both demultiplexing (Additional file 2 : Fig S12a and Additional file 1 : Table S7) and doublet detecting methods (Additional file 2 : Fig S12b). The mitochondrial percentage for false doublets is statistically lower than the correctly classified droplets for a few larger pools for Freemuxlet , ScSplit , and Souporcell . The doublet detecting method Solo also demonstrates a small but significant decrease in mitochondrial percentage in the false doublets compared to the correctly annotated droplets. However, other doublet detecting methods including DoubletFinder and the larger pools of most other methods demonstrated a significant increase in mitochondrial percent in the false doublets compared to the correctly annotated droplets (Additional file 2 : Fig S12b).

Overall, these results demonstrate a strong relationship between the number of genes and UMIs and limited influence of ribosomal or mitochondrial percentage in a droplet and false classification, suggesting that the number of genes and UMIs can significantly bias singlet and doublet classification by demultiplexing and doublet detecting methods.

Ambient RNA, number of reads per cell, and uneven pooling impact method performance

To further quantify the variables that impact the performance of each method, we simulated four conditions that could occur with single-cell RNA-seq experiments: (1) decreased number of reads (reduced 50%), (2) increased ambient RNA (10%, 20% and 50%), (3) increased mitochondrial RNA (5%, 10% and 25%) and (4) uneven donor pooling from single donor spiking (0.5 or 0.75 proportion of pool from one donor). We chose these scenarios because they are common technical effects that can occur.

We observed a consistent decrease in the demultiplexing method performance when the number of reads was decreased by 50%, but the degree of the effect varied for each method and was larger in pools containing more multiplexed donors (Additional file 2 : Fig S13a and Additional file 1 : Table S8). Decreasing the number of reads did not have a detectable impact on the performance of the doublet detecting methods.

Simulating additional ambient RNA (10%, 20%, or 50%) decreased the performance of all the demultiplexing methods (Additional file 2 : Fig S13b and Additional file 1 : Table S9), but some were unimpacted in pools that had 16 or fewer individuals multiplexed ( Souporcell and Vireo ). The performance of some of the doublet detecting methods was impacted by the ambient RNA, but the performance of most methods did not decrease. Scrublet and ScDblFinder were the doublet detecting methods most impacted by ambient RNA but only in pools with at least 32 multiplexed donors (Additional file 2 : Fig S13b and Additional file 1 : Table S9).

Increased mitochondrial percent did not impact the performance of demultiplexing or doublet detecting methods (Additional file 2 : Fig S13c and Additional file 1 : Table S10).

We also tested whether experimental designs that pool uneven proportions of donors would alter performance. We tested scenarios where either half the pool was composed of a single donor (0.5 spiked donor proportion) or where three quarters of the pool was composed of a single donor (0.75 spiked donor proportion). This experimental design significantly reduced the demultiplexing method performance (Additional file 2 : Fig S13d and Additional file 1 : Table S11) with the smallest influence on Freemuxlet . The performance of most of the doublet detecting methods was unimpacted except for DoubletDetection , which demonstrated significant decreases in performance in pools where at least 16 donors were multiplexed. Intriguingly, the performance of Solo increased with the spiked donor pools when the pools consisted of 16 donors or fewer.

Our results demonstrate significant differences in overall performance between different demultiplexing and doublet detecting methods. We further noticed some differences in the use of the methods. Therefore, we have accumulated these results and each method’s unique characteristics and benefits in a heatmap for visual interpretation (Fig. 4 ).

Fig. 4 Assessment of each of the demultiplexing and doublet detecting methods. Assessments of a variety of metrics for each of the demultiplexing (top) and doublet detecting (bottom) methods

Framework for improving singlet classifications via method combinations

After identifying the demultiplexing and doublet detecting methods that performed well individually, we next sought to test whether using intersectional combinations of multiple methods would enhance droplet classifications and provide a software platform— Demuxafy —capable of supporting the execution of these intersectional combinations.

We recognise that different experimental designs will be required for each project. As such, we considered this when testing combinations of methods. We considered multiple experiment designs and two different intersectional methods: (1) more than half had to classify a droplet as a singlet to be called a singlet and (2) at least half of the methods had to classify a droplet as a singlet to be called a singlet. Significantly, these two intersectional methods only differ when an even number of methods are being considered. For combinations that include demultiplexing methods, the individual called by the majority of the methods is the individual used for that droplet. When ties occur, the individual is considered ‘unassigned’.
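The two intersectional rules described above can be illustrated with a minimal Python sketch. This is not the Demuxafy implementation: the function name `combine_calls`, the input layout and the `"not_singlet"` label are hypothetical, and resolving the donor by the most frequent call among singlet-calling methods (with ties returned as unassigned) is one assumed reading of the text.

```python
from collections import Counter


def combine_calls(calls, rule="more_than_half"):
    """Combine per-method droplet calls for one droplet.

    calls: dict mapping a method name to (droplet_type, donor), where
    droplet_type is "singlet" or "doublet" and donor is None for methods
    that do not assign individuals (doublet detecting methods).
    """
    if not calls:
        raise ValueError("at least one method call is required")

    n_methods = len(calls)
    n_singlet = sum(1 for droplet_type, _ in calls.values()
                    if droplet_type == "singlet")

    # Rule 1: more than half of the methods must call a singlet.
    # Rule 2: at least half of the methods must call a singlet.
    if rule == "more_than_half":
        is_singlet = 2 * n_singlet > n_methods
    elif rule == "at_least_half":
        is_singlet = 2 * n_singlet >= n_methods
    else:
        raise ValueError(f"unknown rule: {rule}")

    if not is_singlet:
        return ("not_singlet", None)

    # Count donor votes from methods that called a singlet and named a donor.
    donor_votes = Counter(donor for droplet_type, donor in calls.values()
                          if droplet_type == "singlet" and donor is not None)
    if not donor_votes:
        return ("singlet", None)  # only doublet detecting methods were combined

    ranked = donor_votes.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return ("singlet", "unassigned")  # tie between donors
    return ("singlet", ranked[0][0])
```

With an odd number of methods both rules give the same answer; with two methods where only one calls a singlet, `"more_than_half"` rejects the droplet while `"at_least_half"` keeps it.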

Combining multiple doublet detecting methods improves doublet removal for non-multiplexed experimental designs

For the non-multiplexed experimental design, we considered all possible method combinations (Additional file 1 : Table S12). We identified important differences depending on the number of droplets captured and have provided recommendations accordingly. We identified that DoubletFinder , Scrublet , ScDblFinder and Scds is the ideal combination for balanced droplet calling when less than 2,000 droplets are captured. Scds and ScDblFinder or Scrublet , Scds and ScDblFinder is the best combination when 2,000–10,000 droplets are captured. Scds , Scrublet, ScDblFinder and DoubletDetection is the best combination when 10,000–20,000 droplets are captured and Scrublet , Scds , DoubletDetection and ScDblFinder . It is important to note that even a slight increase in the MCC significantly impacts the number of true singlets and true doublets classified with the degree of benefit highly dependent on the original method performance. The combined method increases the MCC compared to individual doublet detecting methods on average by 0.11 and up to 0.33—a significant improvement in the MCC ( t -test FDR < 0.05 for 95% of comparisons). For all combinations, the intersectional droplet method requires more than half of the methods to consider the droplet a singlet to classify it as a singlet (Fig. 5 ).

Fig. 5 Recommended method combinations dependent on experimental design. Method combinations are provided for different experimental designs, including those that are not multiplexed (left) and multiplexed (right), including experiments that have reference SNP genotypes available vs those that do not and finally, multiplexed experiments with different numbers of individuals multiplexed. Each bar represents either a single method (shown with the coloured icon above the bar) or a combination of methods (shown with the addition of the methods and an arrow indicating the bar). The proportion of true singlets, true doublets, false singlets and false doublets for each method or combination of methods is shown with the filled barplot and the MCC is shown with the black points overlaid on the barplot. MCC: Matthews correlation coefficient
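For readers less familiar with the MCC, the standard definition can be computed directly from the four droplet counts shown in the barplots. The sketch below is illustrative only; treating singlets as the positive class and returning 0 when the denominator is zero are assumptions, not choices stated in the text.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Assumed mapping: tp = true singlets, tn = true doublets,
    fp = false singlets, fn = false doublets.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # conventional value when a row or column is empty
    return (tp * tn - fp * fn) / denominator
```

The value ranges from -1 to 1, with 1 for a perfect classification and 0 for a classification no better than chance, which is why it is informative when singlets greatly outnumber doublets.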

Demuxafy performs better than Chord

Chord is an ensemble machine learning doublet detecting method that uses Scds and DoubletFinder to identify doublets. We compared Demuxafy using Scds and DoubletFinder to Chord and identified that Demuxafy outperformed Chord in pools that contained at least eight donors and was equivalent in pools that contained less than eight donors (Additional file 2 : Fig S14). This is because Chord classifies more droplets as false singlets and false doublets than Demuxafy . In addition, Chord failed to complete for two of the pools that contained 128 multiplexed donors.

Combining multiple demultiplexing and doublet detecting methods improves doublet removal for multiplexed experimental designs

For experiments where 16 or fewer individuals are multiplexed with reference SNP genotypes available, we considered all possible combinations between the demultiplexing and doublet detecting methods except ScDblFinder with known doublets due to its highly similar performance to ScDblFinder (Fig 3 ; Additional file 1 : Table S13). The best combinations are DoubletFinder , Scds , ScDblFinder , Vireo and Demuxalot (refined) (<~5 donors) and Scrublet , ScDblFinder , DoubletDetection , Dropulation and Demuxalot (refined) (Fig. 5 ). These intersectional methods increase the MCC compared to the individual methods ( t -test FDR < 0.05), generally resulting in increased true singlets and doublets compared to the individual methods. The improvement in MCC depends on each individual method’s performance but, on average, the MCC increases by 0.22 and up to 0.71. For experiments where the reference SNP genotypes are unknown for the individuals multiplexed in pools with 16 or fewer individuals, DoubletFinder , ScDblFinder, Souporcell and Vireo (<~5 donors) and Scds , ScDblFinder , DoubletDetection , Souporcell and Vireo are the ideal methods (Fig. 5 ). These intersectional methods again significantly increase the MCC up to 0.87 compared to any of the individual techniques that could be used for this experimental design ( t -test FDR < 0.05 for 94.2% of comparisons). In both cases, singlets should only be called if more than half of the methods in the combination classify the droplet as a singlet.

Combining multiple demultiplexing methods improves doublet removal for large multiplexed experimental designs

For experiments that multiplex more than 16 individuals, we considered the combinations between all demultiplexing methods (Additional file 1 : Table S14) since only a small proportion of the doublets would be undetectable by demultiplexing methods (droplets that are homogenic; Fig 3 b). To balance doublet removal and maintain true singlets, we recommend the combination of Demuxalot (refined) and Dropulation . These method combinations significantly increase the MCC by, on average, 0.09 compared to all the individual methods ( t -test FDR < 0.05). This substantially increases true singlets and true doublets relative to the individual methods. If reference SNP genotypes are not available for the individuals multiplexed in the pools, Vireo performs the best (≥ 16 multiplexed individuals; Fig. 5 ). This is the only scenario in which executing a single method is advantageous to a combination of methods. This is likely due to the fact that most of the methods perform poorly for larger pool sizes (Fig. 3 c).

These results collectively demonstrate that, regardless of the experimental design, demultiplexing and doublet detecting approaches that intersect multiple methods significantly enhance droplet classification. This is consistent across different pool sizes and will improve singlet annotation.

Demuxafy improves doublet removal and improves usability

To make our intersectional approaches accessible to other researchers, we have developed Demuxafy ( https://demultiplexing-doublet-detecting-docs.readthedocs.io/en/latest/index.html ), an easy-to-use software platform powered by Singularity. This platform provides the requirements and instructions to execute each demultiplexing and doublet detecting method. In addition, Demuxafy provides wrapper scripts that simplify method execution and effectively summarise results. We also offer tools that help estimate expected numbers of doublets and provide method combination recommendations based on scRNA-seq pool characteristics. Demuxafy also combines the results from multiple different methods, provides classification combination summaries, and provides final integrated combination classifications based on the intersectional techniques selected by the user. The significant advantages of Demuxafy include a centralised location to execute each of these methods, simplified ways to combine methods with an intersectional approach, and summary tables and figures that enable practical interpretation of multiplexed datasets (Fig. 1 a).

Demultiplexing and doublet detecting methods have made large-scale scRNA-seq experiments achievable. However, many demultiplexing and doublet detecting methods have been developed in the recent past, and it is unclear how their performances compare. Further, the demultiplexing techniques best detect heterogenic doublets while doublet detecting methods identify heterotypic doublets. Therefore, we hypothesised that demultiplexing and doublet detecting methods would be complementary and be more effective at removing doublets than demultiplexing methods alone.

Indeed, we demonstrated the benefit of utilising a combination of demultiplexing and doublet detecting methods. The optimal intersectional combination of methods depends on the experimental design and capture characteristics. Our results suggest super loaded captures—where a high percentage of doublets is expected—will benefit from multiplexing. Further, when many donors are multiplexed (>16), doublet detecting is not required as there are few doublets that are homogenic and heterotypic.

We have provided different method combination recommendations based on the experimental design. This decision is highly dependent on the research question.

Conclusions

Overall, our results provide researchers with important demultiplexing and doublet detecting performance assessments and combinatorial recommendations. Our software platform, Demuxafy ( https://demultiplexing-doublet-detecting-docs.readthedocs.io/en/latest/index.html ), provides a simple implementation of our methods in any research lab around the world, providing cleaner scRNA-seq datasets and enhancing interpretation of results.

PBMC scRNA-seq data

Blood samples were collected and processed as described previously [ 17 ]. Briefly, mononuclear cells were isolated from whole blood samples and stored in liquid nitrogen until thawed for scRNA-seq capture. Equal numbers of cells from 12 to 16 samples were multiplexed per pool and single-cell suspensions were super loaded on a Chromium Single Cell Chip A (10x Genomics) to capture 20,000 droplets per pool. Single-cell libraries were processed per manufacturer instructions and the 10× Genomics Cell Ranger Single Cell Software Suite (v 2.2.0) was used to process the data and map it to GRCh38. Cellbender v0.1.0 was used to identify empty droplets. Almost all droplets reported by Cell Ranger were identified to contain cells by Cellbender (mean: 99.97%). The quality control metrics of each pool are demonstrated in Additional file 2 : Fig S15.

PBMC DNA SNP genotyping

SNP genotype data were prepared as described previously [ 17 ]. Briefly, DNA was extracted from blood with the QIAamp Blood Mini kit and genotyped on the Illumina Infinium Global Screening Array. SNP genotypes were processed with Plink and GCTA before imputing on the Michigan Imputation Server using Eagle v2.3 for phasing and Minimac3 for imputation based on the Haplotype Reference Consortium panel (HRCr1.1). SNP genotypes were then lifted to hg38 and filtered for > 1% minor allele frequency (MAF) and an R 2 > 0.3.

Fibroblast scRNA-seq data

The fibroblast scRNA-seq data has been described previously [ 18 ]. Briefly, human skin punch biopsies from donors over the age of 18 were cultured in DMEM high glucose supplemented with 10% fetal bovine serum (FBS), L-glutamine, 100 U/mL penicillin and 100 μg/mL (Thermo Fisher Scientific, USA).

For scRNA-seq, viable cells were flow sorted and single cell suspensions were loaded onto a 10× Genomics Single Cell 3’ Chip and were processed per 10× instructions and the Cell Ranger Single Cell Software Suite from 10× Genomics was used to process the sequencing data into transcript count tables as previously described [ 18 ]. Cellbender v0.1.0 was used to identify empty droplets. Almost all droplets reported by Cell Ranger were identified to contain cells by Cellbender (mean: 99.65%). The quality control metrics of each pool are demonstrated in Additional file 2 : Fig S16.

Fibroblast DNA SNP genotyping

The DNA SNP genotyping for fibroblast samples has been described previously [ 18 ]. Briefly, DNA from each donor was genotyped on an Infinium HumanCore-24 v1.1 BeadChip (Illumina). GenomeStudio™ V2.0 (Illumina), Plink and GenomeStudio were used to process the SNP genotypes. Eagle V2.3.5 was used to phase the SNPs, which were then imputed with the Michigan Imputation Server using Minimac3 and the 1000 Genomes phase 3 reference panel as described previously [ 18 ].

Demultiplexing methods

All the demultiplexing methods were built and run from a singularity image.

Demuxalot [ 6 ] is a genotype reference-based single cell demultiplexing method. Demuxalot v0.2.0 was used in Python v3.8.5 to annotate droplets. The likelihoods, posterior probabilities and most likely donor for each droplet were estimated using the Demuxalot Demultiplexer.predict_posteriors function. We also used the Demuxalot Demultiplexer.learn_genotypes function to refine the genotypes before estimating the likelihoods, posterior probabilities and likely donor of each droplet with the refined genotypes as well.

The Popscle v0.1-beta suite [ 16 ] for population genomics in single cell data was used for Demuxlet and Freemuxlet demultiplexing methods. The popscle dsc-pileup function was used to create a pileup of variant calls at known genomic locations from aligned sequence reads in each droplet with default arguments.

Demuxlet [ 3 ] is a SNP genotype reference-based single cell demultiplexing method. Demuxlet was run with a genotype error coefficient of 1 and genotype error offset rate of 0.05 and the other default parameters using the popscle demuxlet command from Popscle (v0.1-beta).

Freemuxlet [ 16 ] is a SNP genotype reference-free single cell demultiplexing method. Freemuxlet was run with default parameters including the number of samples included in the pool using the popscle freemuxlet command from Popscle (v0.1-beta).

Dropulation

Dropulation [ 5 ] is a SNP genotype reference-based single cell demultiplexing method that is part of the Drop-seq software. Dropulation from Drop-seq v2.5.1 was implemented for this manuscript. In addition, the method for calling singlets and doublets was provided by the Dropulation developer and implemented in a custom R script available on Github and Zenodo (see “Availability of data and materials”).

ScSplit v1.0.7 [ 7 ] was downloaded from the ScSplit GitHub repository and the recommended steps for data filtering and quality control prior to running ScSplit were followed. Briefly, reads that had read quality lower than 10, were unmapped, were secondary alignments, did not pass filters, were optical PCR duplicates or were duplicate reads were removed. The resulting bam file was then sorted and indexed, followed by freebayes to identify single nucleotide variants (SNVs) in the dataset. The resulting SNVs were filtered for quality scores greater than 30 and for variants present in the reference SNP genotype vcf. The resulting filtered bam and vcf files were used as input for the scSplit count command with default settings to count the number of reference and alternative alleles in each droplet. Next, the allele matrices were used to demultiplex the pool and assign cells to different clusters using the scSplit run command including the number of individuals ( -n ) option and all other options set to default. Finally, the individual genotypes were predicted for each cluster using the scSplit genotype command with default parameters.
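The read filtering step described above can be expressed as a simple predicate on each alignment's SAM flag and quality. The sketch below is an illustration rather than the command used in the manuscript: it assumes that "read quality" refers to mapping quality (MAPQ), and the function and constant names are hypothetical. The flag bit values themselves come from the SAM specification.

```python
# SAM flag bits, as defined in the SAM format specification.
FLAG_UNMAPPED = 0x4      # read is unmapped
FLAG_SECONDARY = 0x100   # secondary alignment
FLAG_QC_FAIL = 0x200     # read fails platform/vendor quality checks
FLAG_DUPLICATE = 0x400   # PCR or optical duplicate

# Any read with at least one of these bits set is removed.
EXCLUDE_MASK = FLAG_UNMAPPED | FLAG_SECONDARY | FLAG_QC_FAIL | FLAG_DUPLICATE


def keep_read(flag, mapq, min_mapq=10):
    """Return True if an alignment passes the filters described in the text.

    Assumption: the quality threshold of 10 is applied to MAPQ.
    """
    return mapq >= min_mapq and (flag & EXCLUDE_MASK) == 0
```

The combined mask evaluates to 1796, which is the integer a flag-based exclusion filter would need to cover these four categories.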

Souporcell [ 4 ] is a SNP genotype reference-free single cell demultiplexing method. The Souporcell v1.0 singularity image was downloaded via instructions from the GitHub page. The Souporcell pipeline was run using the souporcell_pipeline.py script with default options and the option to include known variant locations ( --common_variants ).

Vireo [ 2 ] is a single cell demultiplexing method that can be used with reference SNP genotypes or without them. For this assessment, Vireo was used with reference SNP genotypes. Per Vireo recommendations, we used model 1 of the cellSNP [ 20 ] version 0.3.2 to make a pileup of SNPs for each droplet with the recommended options using the genotyped reference genotype file as the list of common known SNP and filtered with SNP locations that were covered by at least 20 UMIs and had at least 10% minor allele frequency across all droplets. Vireo version 0.4.2 was then used to demultiplex using reference SNP genotypes and indicating the number of individuals in the pools.

Doublet detecting methods

All doublet detecting methods were built and run from a singularity image.

DoubletDecon

DoubletDecon [ 9 ] is a transcription-based deconvolution method for identifying doublets. DoubletDecon version 1.1.6 analysis was run in R version 3.6.3. SCTransform [ 21 ] from Seurat [ 22 ] version 3.2.2 was used to preprocess the scRNA-seq data and then the Improved_Seurat_Pre_Process function was used to process the SCTransformed scRNA-seq data. Clusters were identified using Seurat function FindClusters with resolution 0.2 and 30 principal components (PCs). Then the Main_Doublet_Decon function was used to deconvolute doublets from singlets for six different rhops—0.6, 0.7, 0.8, 0.9, 1.0 and 1.1. We used a range of rhop values since the doublet annotation by DoubletDecon is dependent on the rhop parameter which is selected by the user. The rhop that resulted in the closest number of doublets to the expected number of doublets was selected on a per-pool basis and used for all subsequent analysis. Expected number of doublets were estimated with the following equation:

where N is the number of droplets captured and D is the number of expected doublets.
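The per-pool rhop selection described for DoubletDecon amounts to picking the value whose doublet count is closest to the expected count. A minimal sketch follows, assuming the doublet counts for each rhop have already been collected into a dictionary; the function name and the example counts are hypothetical, and the expected number of doublets is taken as an input rather than computed here.

```python
def select_rhop(doublets_by_rhop, expected_doublets):
    """Return the rhop whose doublet count is closest to the expected count.

    doublets_by_rhop: dict mapping each tested rhop to the number of
    doublets DoubletDecon called at that rhop. If two rhops are equally
    close, the one that appears first in the dict is returned.
    """
    return min(doublets_by_rhop,
               key=lambda rhop: abs(doublets_by_rhop[rhop] - expected_doublets))


# Hypothetical counts for one pool across the six tested rhop values.
example_counts = {0.6: 3200, 0.7: 2500, 0.8: 2100, 0.9: 1700, 1.0: 1200, 1.1: 800}
best_rhop = select_rhop(example_counts, expected_doublets=2000)
```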

DoubletDetection

DoubletDetection [ 14 ] is a transcription-based method for identifying doublets. DoubletDetection version 2.5.2 analysis was run in Python version 3.6.8. Droplets without any UMIs were removed before analysis with DoubletDetection . Then the doubletdetection.BoostClassifier function was run with 50 iterations with use_phenograph set to False and standard_scaling set to True. The predicted number of doublets per iteration was visualised across all iterations, and any pool that did not converge after 50 iterations was run again with increasing numbers of iterations until it reached convergence.

DoubletFinder

DoubletFinder [ 10 ] is a transcription-based doublet detecting method. DoubletFinder version 2.0.3 was implemented in R version 3.6.3. First, droplets that were more than 3 median absolute deviations (mad) away from the median for mitochondrial per cent, ribosomal per cent, number of UMIs or number of genes were removed per developer recommendations. Then the data was normalised with SCTransform followed by cluster identification using FindClusters with resolution 0.3 and 30 principal components (PCs). Then, pKs were selected by the pK that resulted in the largest BC MVN as recommended by DoubletFinder. The pK vs BC MVN relationship was visually inspected for each pool to ensure an effective BC MVN was selected for each pool. Finally, the homotypic doublet proportions were calculated and the number of expected doublets with the highest doublet proportion were classified as doublets per the following equation:
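The outlier filter applied before DoubletFinder (removing droplets more than 3 median absolute deviations from the median of a QC metric) can be sketched as follows. This is an illustration, not the analysis code: the function name is hypothetical, and the scaling constant 1.4826 is an assumption that mirrors the default of R's `mad` function.

```python
import statistics


def within_mad_range(values, n_mads=3.0, scale=1.4826):
    """Return a list of booleans marking values within n_mads MADs of the median.

    values: one QC metric per droplet (e.g. mitochondrial per cent).
    scale: consistency constant applied to the raw MAD (assumed, as in R's mad()).
    """
    median = statistics.median(values)
    mad = scale * statistics.median([abs(v - median) for v in values])
    return [abs(v - median) <= n_mads * mad for v in values]


# Hypothetical mitochondrial percentages for six droplets.
mito_percent = [5.0, 5.5, 6.0, 6.5, 7.0, 40.0]
keep = within_mad_range(mito_percent)
```

In practice the same mask would be computed for each of the four metrics named in the text and a droplet kept only if it passes all of them.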

ScDblFinder

ScDblFinder [ 11 ] is a transcription-based method for detecting doublets from scRNA-seq data. ScDblFinder 1.3.25 was implemented in R version 4.0.3. ScDblFinder was implemented with two sets of options. The first included implementation with the expected doublet rate as calculated by:

where N is the number of droplets captured and R is the expected doublet rate. The second condition included the same expected number of doublets and included the doublets that had already been identified by all the demultiplexing methods.

Scds [ 12 ] is a transcription-based doublet detecting method. Scds version 1.1.2 analysis was completed in R version 3.6.3. Scds was implemented with the cxds function and bcds functions with default options followed by the cxds_bcds_hybrid with estNdbl set to TRUE so that doublets will be estimated based on the values from the cxds and bcds functions.

Scrublet [ 13 ] is a transcription-based doublet detecting method for single-cell RNA-seq data. Scrublet was implemented in python version 3.6.3. Scrublet was implemented per developer recommendations with at least 3 counts per droplet, 3 cells expressing a given gene, 30 PCs and a doublet rate based on the following equation:

where N is the number of droplets captured and R is the expected doublet rate. Four different minimum variable gene percentiles were tested: 80, 85, 90 and 95. Then, the best variable gene percentile was selected based on the distribution of the simulated doublet scores and the location of the doublet threshold selection. Where the selected threshold did not fall between the two modes of a bimodal distribution, the pool was run again with a manually set threshold.

Solo [ 15 ] is a transcription-based method for detecting doublets in scRNA-seq data. Solo was implemented with default parameters and an expected number of doublets based on the following equation:

where N is the number of droplets captured and D is the number of expected doublets. Solo was additionally implemented in a second run for each pool with the doublets that were identified by all the demultiplexing methods as known doublets to initialize the model.

In silico pool generation

Cells that were identified as singlets by all methods were used to simulate pools. Ten pools containing 2, 4, 8, 16, 32, 64 and 128 individuals were simulated assuming a maximum 20% doublet rate as it is unlikely researchers would use a technology that has a higher doublet rate. The donors for each simulated pool were randomly selected using a custom R script which is available on Github and Zenodo (see ‘Availability of data and materials’). A separate bam for the cell barcodes for each donor was generated using the filterbarcodes function from the sinto package (v0.8.4). Then, the GenerateSyntheticDoublets function provided by the Drop-seq [ 5 ] package was used to simulate new pools containing droplets with known singlets and doublets.

Twenty-one total pools—three pools from each of the different simulated pool sizes (2, 4, 8, 16, 32, 64 and 128 individuals) —were used to simulate different experimental scenarios that may be more challenging for demultiplexing and doublet detecting methods. These include simulating higher ambient RNA, higher mitochondrial percent, decreased read coverage and imbalanced donor proportions as described subsequently.

High ambient RNA simulations

Ambient RNA was simulated by changing the barcodes and UMIs on a random selection of reads for 10, 20 or 50% of the total UMIs. This was executed with a custom R script that is available in Github and Zenodo (see ‘Availability of data and materials’).

High mitochondrial percent simulations

High mitochondrial percent simulations were produced by replacing reads in 5, 10 or 25% of the randomly selected cells with mitochondrial reads. The number of reads to replace was derived from a normal distribution with an average of 30 and a standard deviation of 3. This was executed with a custom R script available in Github and Zenodo (see ‘Availability of data and materials’).
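The sampling logic of this simulation can be sketched in a few lines. The sketch only plans which cells are altered and how many reads are replaced in each; it does not edit any bam file, and the function name, the fixed seed and the rounding of the sampled read counts are assumptions made for illustration.

```python
import random


def plan_mito_replacement(cell_barcodes, fraction, mean_reads=30.0,
                          sd_reads=3.0, seed=0):
    """Choose a random fraction of cells and a read count to replace in each.

    Returns a dict mapping each selected barcode to the number of reads to
    replace with mitochondrial reads, drawn from a normal distribution with
    the given mean and standard deviation (rounded, never below zero).
    """
    rng = random.Random(seed)  # fixed seed so the plan is reproducible
    barcodes = list(cell_barcodes)
    n_selected = round(len(barcodes) * fraction)
    selected = rng.sample(barcodes, n_selected)
    return {barcode: max(0, round(rng.gauss(mean_reads, sd_reads)))
            for barcode in selected}


# Hypothetical pool of 100 cells with 25% of cells altered.
example_barcodes = [f"cell_{i}" for i in range(100)]
example_plan = plan_mito_replacement(example_barcodes, fraction=0.25)
```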

Imbalanced donor simulations

We simulated pools that contained uneven proportions of the donors in the pools to identify if some methods are better at demultiplexing pools containing uneven proportions of each donor in the pool. We simulated pools where 50, 75 or 95% of the pool contained cells from a single donor and the remainder of the pool was even proportions of the remaining donors in the pool. This was executed with a custom R script available in Github and Zenodo (see ‘Availability of data and materials’).
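The pool composition for a spiked design follows directly from the description: one donor contributes a fixed fraction of the cells and the remainder is split evenly among the other donors. The sketch below shows one way to compute the per-donor cell counts; the function name and the handling of remainders when the split is not exact are assumptions.

```python
def spiked_donor_counts(n_cells, donors, spiked_donor, spiked_fraction):
    """Return the number of cells to draw from each donor in a spiked pool.

    The spiked donor receives spiked_fraction of n_cells; the remaining
    cells are divided as evenly as possible among the other donors, with
    any leftover cells given to the first donors in the list.
    """
    others = [d for d in donors if d != spiked_donor]
    if not others:
        raise ValueError("a spiked pool needs at least two donors")

    n_spiked = round(n_cells * spiked_fraction)
    base, extra = divmod(n_cells - n_spiked, len(others))

    counts = {spiked_donor: n_spiked}
    for index, donor in enumerate(others):
        counts[donor] = base + (1 if index < extra else 0)
    return counts


# Hypothetical four-donor pool of 1000 cells with 75% from donor d1.
example_counts = spiked_donor_counts(1000, ["d1", "d2", "d3", "d4"], "d1", 0.75)
```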

Decreased read coverage simulations

Decreased read coverage of pools was simulated by down-sampling the reads by two-thirds of the original coverage.

Classification annotation

Demultiplexing method classifications were considered correct if both the droplet annotation (singlet or doublet) and the individual annotation were correct. If the droplet type was correct but the individual annotation was incorrect (i.e. classified as a singlet but annotated as the wrong individual), then the droplet was considered incorrectly classified.

Doublet detecting methods were considered to have correct classifications if the droplet annotation matched the known droplet type.
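These two correctness rules can be written as small predicates. The sketch is illustrative and the function names are hypothetical; it assumes that for true doublets only the droplet type needs to match, since the text gives the individual check only for singlets.

```python
def demultiplexing_call_is_correct(true_type, true_donor, called_type, called_donor):
    """Correct only if the droplet type matches and, for singlets, the donor too."""
    if called_type != true_type:
        return False
    if true_type == "singlet":
        return called_donor == true_donor
    return True  # assumption: doublets are judged on droplet type alone


def doublet_detecting_call_is_correct(true_type, called_type):
    """Correct if the called droplet type matches the known droplet type."""
    return called_type == true_type
```

For example, a true singlet from donor A that is called a singlet from donor B counts as incorrect for a demultiplexing method, whereas a doublet detecting method calling it a singlet is correct.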

All downstream analyses were completed in R version 4.0.2.

Availability of data and materials

All data used in this manuscript are publicly available. The PBMC data are available on GEO (Accession: GSE196830) [ 23 ] as originally described in [ 17 ]. The fibroblast data are available on ArrayExpress (Accession Number: E-MTAB-10060) [ 24 ] as originally described in [ 18 ]. The code used for the analyses in this manuscript is provided on Github ( https://github.com/powellgenomicslab/Demuxafy_manuscript/tree/v4 ) and Zenodo ( https://zenodo.org/records/10813452 ) under an MIT Open Source License [ 25 , 26 ]. Demuxafy is provided as a package with source code available on Github ( https://github.com/drneavin/Demultiplexing_Doublet_Detecting_Docs ) and instructions on ReadTheDocs ( https://demultiplexing-doublet-detecting-docs.readthedocs.io/en/latest/ ) under an MIT Open Source License [ 27 ]. Demuxafy is also available on Zenodo with the link https://zenodo.org/records/10870989 [ 28 ].

Zheng GXY, Terry JM, Belgrader P, Ryvkin P, Bent ZW, Wilson R, et al. Massively parallel digital transcriptional profiling of single cells. Nat Commun. 2017;8:1–12.

Huang Y, McCarthy DJ, Stegle O. Vireo: Bayesian demultiplexing of pooled single-cell RNA-seq data without genotype reference. Genome Biol. 2019;20:273.

Kang HM, Subramaniam M, Targ S, Nguyen M, Maliskova L, McCarthy E, et al. Multiplexed droplet single-cell RNA-sequencing using natural genetic variation. Nat Biotechnol. 2018;36:89–94.

Heaton H, Talman AM, Knights A, Imaz M, Gaffney DJ, Durbin R, et al. Souporcell: robust clustering of single-cell RNA-seq data by genotype without reference genotypes. Nat Methods. 2020;17:615–20.

Wells MF, Nemesh J, Ghosh S, Mitchell JM, Salick MR, Mello CJ, et al. Natural variation in gene expression and viral susceptibility revealed by neural progenitor cell villages. Cell Stem Cell. 2023;30:312–332.e13.


Rogozhnikov A, Ramkumar P, Shah K, Bedi R, Kato S, Escola GS. Demuxalot: scaled up genetic demultiplexing for single-cell sequencing. bioRxiv. 2021;2021.05.22.443646.

Xu J, Falconer C, Nguyen Q, Crawford J, McKinnon BD, Mortlock S, et al. Genotype-free demultiplexing of pooled single-cell RNA-seq. Genome Biol. 2019;20:290.

What is the maximum number of cells that can be profiled? Available from: https://kb.10xgenomics.com/hc/en-us/articles/360001378811-What-is-the-maximum-number-of-cells-that-can-be-profiled -

DePasquale EAK, Schnell DJ, Van Camp PJ, Valiente-Alandí Í, Blaxall BC, Grimes HL, et al. DoubletDecon: deconvoluting doublets from single-cell RNA-sequencing data. Cell Rep. 2019;29:1718–1727.e8.

McGinnis CS, Murrow LM, Gartner ZJ. DoubletFinder: doublet detection in single-cell RNA sequencing data using artificial nearest neighbors. Cell Syst. 2019;8:329–337.e4.

Germain P-L, Lun A, Meixide CG, Macnair W, Robinson MD. Doublet identification in single-cell sequencing data. 2022;

Bais AS, Kostka D. Scds: Computational annotation of doublets in single-cell RNA sequencing data. Bioinformatics. 2020;36:1150–8.

Wolock SL, Lopez R, Klein AM. Scrublet: computational identification of cell doublets in single-cell transcriptomic data. Cell Syst. 2019;8:281–291.e9.

Shor J. DoubletDetection. Available from: https://github.com/JonathanShor/DoubletDetection .

Bernstein NJ, Fong NL, Lam I, Roy MA, Hendrickson DG, Kelley DR. Solo: doublet identification in single-cell RNA-Seq via semi-supervised deep learning. Cell Syst. 2020;11:95–101.e5.

popscle. Available from: https://github.com/statgen/popscle .

Yazar S, Alquicira-Hernandez J, Wing K, Senabouth A, Gordon MG, Andersen S, et al. Single-cell eQTL mapping identifies cell type–specific genetic control of autoimmune disease. Science. 2022;376:eabf3041.

Neavin D, Nguyen Q, Daniszewski MS, Liang HH, Chiu HS, Senabouth A, et al. Single cell eQTL analysis identifies cell type-specific genetic control of gene expression in fibroblasts and reprogrammed induced pluripotent stem cells. Genome Biol. 2021;1–19.

Hao Y, Hao S, Andersen-Nissen E, Mauck WM, Zheng S, Butler A, et al. Integrated analysis of multimodal single-cell data. Cell. 2021;184:3573–3587.e29.

Huang X, Huang Y. Cellsnp-lite: an efficient tool for genotyping single cells. bioRxiv. 2021;2020.12.31.424913.

Hafemeister C, Satija R. Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression. bioRxiv. 2019;576827.

Stuart T, Butler A, Hoffman P, Hafemeister C, Papalexi E, Mauck WM, et al. Comprehensive integration of single-cell data. Cell. 2019;177:1888–1902.e21.

Powell JE. Single-cell eQTL mapping identifies cell type specific genetic control of autoimmune disease. Datasets. Gene Expression Omnibus. 2022. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE196830 .

Nguyen Q, Powell JE. scRNA-seq in 79 fibroblast cell lines and 31 reprogrammed induced pluripotent stem cell lines for sceQTL analysis. Datasets. ArrayExpress. 2021. https://www.ebi.ac.uk/biostudies/arrayexpress/studies/E-MTAB-10060?query=E-MTAB-10060 .

Neavin DR. Demuxafy analyses. Github. 2024. https://github.com/powellgenomicslab/Demuxafy_manuscript/tree/v4 .

Neavin DR. Demuxafy analyses. Zenodo. 2024. https://zenodo.org/records/10813452 .

Neavin D. Demuxafy. Github. 2024. https://github.com/drneavin/Demultiplexing_Doublet_Detecting_Docs .

Neavin D. Demuxafy. Zenodo. 2024.  https://zenodo.org/records/10870989 .

McCaughey T, Liang HH, Chen C, Fenwick E, Rees G, Wong RCB, et al. An interactive multimedia approach to improving informed consent for induced pluripotent stem cell research. Cell Stem Cell. 2016;18:307–8.


Authors’ Twitter handles

Twitter handles: @drneavin (Drew Neavin), @thjimmylee (Jimmy Tsz Hang Lee), @marta_mele_m (Marta Melé)

Peer review information

Wenjing She was the primary editor of this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Review history

The review history is available as Additional file 3.

This work was funded by the National Health and Medical Research Council (NHMRC) Investigator grant (1175781), and funding from the Goodridge foundation. J.E.P. is also supported by a fellowship from the Fok Foundation.

Author information

Authors and affiliations.

Garvan-Weizmann Centre for Cellular Genomics, Garvan Institute for Medical Research, Darlinghurst, NSW, Australia

Drew Neavin, Anne Senabouth, Himanshi Arora & Joseph E. Powell

Present address: Statewide Genomics at NSW Health Pathology, Sydney, NSW, Australia

Himanshi Arora

Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK

Jimmy Tsz Hang Lee

Life Sciences Department, Barcelona Supercomputing Center, Barcelona, Catalonia, Spain

Aida Ripoll-Cladellas & Marta Melé

Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands

Lude Franke

Spatial and Single Cell Systems Domain, Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore

Shyam Prabhakar

Population and Global Health, Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Republic of Singapore

Cancer Science Institute of Singapore, National University of Singapore, Singapore, Republic of Singapore

Bakar Institute for Computational Health Sciences, University of California, San Francisco, CA, USA

Chun Jimmie Ye

Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA

Division of Rheumatology, Department of Medicine, University of California, San Francisco, San Francisco, CA, USA

Chan Zuckerberg Biohub, San Francisco, CA, USA

Bioinformatics and Cellular Genomics, St Vincent’s Institute of Medical Research, Fitzroy, Australia

Davis J. McCarthy

Melbourne Integrative Genomics, School of BioSciences–School of Mathematics & Statistics, Faculty of Science, University of Melbourne, Melbourne, Australia

Present address: The Gene Lay Institute of Immunology and Inflammation, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA

Martin Hemberg

UNSW Cellular Genomics Futures Institute, University of New South Wales, Kensington, NSW, Australia

Joseph E. Powell


sc-eQTLGen Consortium

Contributions.

DRN and JEP conceived the project idea and study design. JTHL, AR, LF, SP, CJY, DJM, MM and MH provided feedback on experimental design. DRN carried out analyses with support on coding from AS. JTHL and AR tested Demuxafy and provided feedback. DRN and JEP wrote the manuscript. All authors reviewed and provided feedback on the manuscript.

Corresponding authors

Correspondence to Drew Neavin or Joseph E. Powell.

Ethics declarations

Ethics approval and consent to participate.

Briefly, all work was approved by the Royal Hobart Hospital, the Hobart Eye Surgeons Clinic, Human Research Ethics Committees of the Royal Victorian Eye and Ear Hospital (11/1031), University of Melbourne (1545394) and University of Tasmania (H0014124) in accordance with the requirements of the National Health & Medical Research Council of Australia (NHMRC) and conformed with the Declaration of Helsinki [29].

Consent for publication

No personal data for any individual requiring consent for publication was included in this manuscript.

Competing interests

C.J.Y. is a founder of and holds equity in DropPrint Genomics (now ImmunAI) and Survey Genomics, is a Scientific Advisory Board member for and holds equity in Related Sciences and ImmunAI, is a consultant for and holds equity in Maze Therapeutics, and is a consultant for TReX Bio, HiBio, ImYoo, and Santa Ana. Additionally, C.J.Y. is newly an Innovation Investigator for the Arc Institute. C.J.Y. has received research support from Chan Zuckerberg Initiative, Chan Zuckerberg Biohub, Genentech, BioLegend, ScaleBio and Illumina.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Supplementary tables and legends.

Additional file 2: Supplementary figures and legends.

Additional file 3: Review history.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Neavin, D., Senabouth, A., Arora, H. et al. Demuxafy: improvement in droplet assignment by integrating multiple single-cell demultiplexing and doublet detection methods. Genome Biol 25, 94 (2024). https://doi.org/10.1186/s13059-024-03224-8

Received: 07 March 2023

Accepted: 25 March 2024

Published: 15 April 2024

DOI: https://doi.org/10.1186/s13059-024-03224-8


  • Single-cell analysis
  • Genetic demultiplexing
  • Doublet detecting

Genome Biology

ISSN: 1474-760X

python while with assignment

IMAGES

  1. Python For Beginners

  2. Python While Loop [With Examples]

  3. Python 3 Tutorial: 11

  4. Assignment operators in python

  5. Python while loop

  6. Python Assignment Expression? The 13 Top Answers

VIDEO

  1. How to use assignment operators in #python

  2. Python Basics

  3. Python Assignment Operator #python #assignmentoperators #pythonoperators #operatorsinpython

  4. Assignment Operators in python #python #operator

  5. continue statement in Python while loop

  6. Python Assignment Operators And Comparison Operators

COMMENTS

  1. Assign variable in while loop condition in Python?

    Starting Python 3.8, and the introduction of assignment expressions (PEP 572) ( := operator), it's now possible to capture the condition value ( data.readline()) of the while loop as a variable ( line) in order to re-use it within the body of the loop: while line := data.readline(): do_smthg(line)

  2. How to use Python While with Assignment[4 Examples]

    First, we will use the "=" operator, an assignment operator in Python. This operator is used to assign a value to a variable, and we will assign a variable inside the loop in Python using the " = " operator. Let's see how Python, while with assignment, works. while True: line = "Hello". i = 0.

  3. Assignment Condition in Python While Loop

    Starting Python 3.8, and the introduction of assignment expressions (PEP 572) (:= operator), it's now possible to capture an expression value (here sys.stdin.read(1)) as a variable in order to use it within the body of while:

  4. Python Assign value to variable during condition in while Loop

    That said, as of Python 3.8 the language will actually have assignment expressions, using := as the assignment operator. See PEP 572. Assignment expressions are actually useful in list comprehensions, for example, when you need to both include a method return value in the list you are building and need to be able to use that value in a test.

  5. Python's Assignment Operator: Write Robust Assignments

    Here, variable represents a generic Python variable, while expression represents any Python object that you can provide as a concrete value—also known as a literal—or an expression that evaluates to a value. To execute an assignment statement like the above, Python runs the following steps: Evaluate the right-hand expression to produce a concrete value or object.

  6. How To Use Assignment Expressions in Python

    In this tutorial, you used assignment expressions to make compact sections of Python code that assign values to variables inside of if statements, while loops, and list comprehensions. For more information on other assignment expressions, you can view PEP 572 —the document that initially proposed adding assignment expressions to Python.

  7. Assignment Expressions: The Walrus Operator

    In this lesson, you'll learn about the biggest change in Python 3.8: the introduction of assignment expressions. Assignment expressions are written with a new notation (:=). This operator is often called the walrus operator as it resembles the eyes and tusks of a walrus on its side. Assignment expressions allow you to assign and return a value in the same expression.

  8. Python While Loop with Multiple Conditions • datagy

    What is a Python While Loop. A Python while loop is an example of iteration, meaning that some Python statement is executed a certain number of times or while a condition is true. A while loop is similar to a Python for loop, but it is executed differently. A Python while loop is both an example of definite iteration, meaning that it iterates a ...

  9. The Walrus Operator: Python 3.8 Assignment Expressions

    Each new version of Python adds new features to the language. For Python 3.8, the biggest change is the addition of assignment expressions. Specifically, the := operator gives you a new syntax for assigning variables in the middle of expressions. This operator is colloquially known as the walrus operator. This tutorial is an in-depth introduction to the walrus operator.

  10. Python while Loop (With Examples)

    while Loop Syntax: while condition: # body of while loop. Here, the while loop evaluates the condition. If the condition is true, the body of the while loop is executed. The condition is evaluated again. This process continues until the condition is False. Once the condition evaluates to False, the loop terminates.

  11. Python While Loop

    print(count) count = count + 1. In simple words, the while loop enables the Python program to repeat a set of operations while a particular condition is true. When the condition becomes false, execution comes out of the loop immediately, and the first statement after the while loop is executed. A while loop is a part of a control flow ...

  12. Python While Loops

    Print i as long as i is less than 6: i = 1 while i < 6: print(i) i += 1. Note: remember to increment i, or else the loop will continue forever. The while loop requires relevant variables to be ready; in this example we need to define an indexing variable, i, which we set to 1.

  13. Python Conditional Assignment (in 3 Ways)

    While working with lists, we often need to check if a list is empty or not, and if it is empty then we need to assign some default value to it. Let's see how we can do it using conditional assignment. my_list = [] # assigning default value to my_list if it is empty. my_list = my_list or [1, 2, 3] print(my_list) # output: [1, 2, 3] Here, we have ...

  14. 18 Python while Loop Examples and Exercises

    In Python programming, we use while loops to do a task a certain number of times repeatedly. The while loop checks a condition and executes the task as long as that condition is satisfied. The loop will stop its execution once the condition is no longer satisfied. The syntax of a while loop is as follows:

  15. PEP 572

    Unparenthesized assignment expressions are prohibited for the value of a keyword argument in a call. Example: foo(x = y := f(x)) # INVALID foo(x=(y := f(x))) # Valid, though probably confusing. This rule is included to disallow excessively confusing code, and because parsing keyword arguments is complex enough already.

  16. python

    The syntax for a while loop is "while condition". The block beneath the while loop executes until either the condition evaluates to False or a break command is executed. "while True" means the condition is always True and the loop won't stop unless a break is executed. It's a frequent Python idiom, used since Python doesn't have a do-while loop.

  17. 15 Python while Loop Exercises with Solutions for Beginners

    Current number: 1 Current number: 3 Current number: 5 Current number: 7 Current number: 9. In this code, we have a while loop that iterates from 1 to 10 (while num <= 10). Inside the loop, we check if the current value of num is even (using num % 2 == 0). If num is even, we use the continue statement to skip the rest of the loop body and move to the next iteration.

  18. Conditional Statements in Python

    In the form shown above: <expr> is an expression evaluated in a Boolean context, as discussed in the section on Logical Operators in the Operators and Expressions in Python tutorial. <statement> is a valid Python statement, which must be indented. (You will see why very soon.) If <expr> is true (evaluates to a value that is "truthy"), then <statement> is executed.

  19. Assignment Operators in Python

    Here, we will cover Assignment Operators in Python. Assignment operators are used to assign values to variables.

  20. Python Exercises, Practice, Challenges

    These free exercises are nothing but Python assignments for the practice where you need to solve different programs and challenges. All exercises are tested on Python 3. Each exercise has 10-20 Questions. The solution is provided for every question. These Python programming exercises are suitable for all Python developers.

  21. python

    Particularly with the while loop, this can remove the need to have an infinite loop, an assignment, and a condition. It also creates a smooth parallel between a loop which simply uses a function call as its condition, and one which uses that as its condition but also uses the actual value.

  22. Demuxafy: improvement in droplet assignment by integrating multiple

    Recent innovations in single-cell RNA-sequencing (scRNA-seq) provide the technology to investigate biological questions at cellular resolution. Pooling cells from multiple individuals has become a common strategy, and droplets can subsequently be assigned to a specific individual by leveraging their inherent genetic differences. An implicit challenge with scRNA-seq is the occurrence of ...

  23. python

    The following example shows two equivalent ways to process a query result. The first uses fetchone() in a while loop, the second uses the cursor as an iterator: print(row) row = cursor.fetchone() print(row) Got it -- thanks for this very clear example and the two options. The second option is pretty neat, I had never seen that style done before.
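Several of the snippets above describe capturing a value in the condition of a while loop with the := operator (Python 3.8 and later) instead of writing an infinite loop with a separate assignment and break. A minimal sketch of the two equivalent forms follows; the function names and the sample text are hypothetical and chosen only for illustration, and io.StringIO stands in for a real file object.

```python
import io


def read_lines_walrus(stream):
    """Collect lines using an assignment expression in the loop condition."""
    lines = []
    # readline() returns "" at end of input; the empty string is falsy,
    # so the loop ends without an explicit break.
    while line := stream.readline():
        lines.append(line.rstrip("\n"))
    return lines


def read_lines_classic(stream):
    """Same behaviour with the older 'while True' plus 'break' idiom."""
    lines = []
    while True:
        line = stream.readline()  # plain "=" assignment inside the loop body
        if not line:
            break
        lines.append(line.rstrip("\n"))
    return lines


# Hypothetical sample input used only for this illustration.
sample = "alpha\nbeta\ngamma\n"
walrus_result = read_lines_walrus(io.StringIO(sample))
classic_result = read_lines_classic(io.StringIO(sample))
print(walrus_result)
print(classic_result)
```

Both functions return the same list. The walrus form keeps the read and the test in one place, while the classic form also runs on Python versions before 3.8.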